Şeyma Cakir - 2017402024

Oya Hoban - 2018402150

Özlem Şenel - 2017402117


INTRODUCTION

In this project, sales of selected products on the e-commerce platform Trendyol will be predicted. The sold count of each product will be examined and the data will be decomposed. Then, several forecasting strategies will be developed, and the best among them will be picked according to their weighted mean absolute percentage errors (WMAPE). Data before 29 May 2021 will be the train set for the models, and data from 29 May to 11 June 2021 will be the test set. There are 9 products to be examined:

Since campaign dates are important for sales, and most sales peaks occur during these periods, Trendyol's campaign dates were investigated as external data and included as the input attribute 'is_campaign'. The data was taken from Trendyol's website.
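A minimal sketch of building the 'is_campaign' flag from a list of campaign dates; the dates below are placeholders for illustration, not the actual Trendyol campaign calendar:

```r
# Hypothetical campaign calendar (assumed dates, for illustration only)
campaign_dates <- as.Date(c("2021-05-09", "2021-05-10", "2021-06-01"))

# Daily date index covering part of the modeling horizon
event_date <- seq(as.Date("2021-05-08"), as.Date("2021-06-02"), by = "day")

# 1 on campaign days, 0 otherwise
is_campaign <- as.integer(event_date %in% campaign_dates)
head(data.frame(event_date, is_campaign))
```

The resulting 0/1 column can be joined to the sales data by date and used as a regressor.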

PRODUCT 1 - La Roche Posay Face Cleanser

Before building forecasting models, the data should be plotted and its seasonality and trend examined. Below is the plot of the sales quantity of Product 1. There is a slightly increasing trend, especially in the middle of the plot, but no significant seasonality is visible. For a closer look, there is a plot of three months of 2021 (March, April and May). Again, the seasonality is not very pronounced, but sales appear higher at the beginning of each month and decrease toward its end, so it can be said that there is monthly seasonality.

Linear Regression Model For Product 1

The first type of model to be used is linear regression. First of all, it is wise to select the attributes that will help the model from the correlation matrix. Below, you can see the correlations between the attributes. According to this matrix, category_sold, category_favored, and basket_count can be added to the model.

In the first model, these attributes are added. The adjusted R-squared value indicates how well the model fits; for the first model it is quite high, which is a good sign. However, there are outliers, probably due to campaigns and holidays, which can be handled for a better model. Lastly, a 'lag1' attribute can be added because the autocorrelation at lag 1 is very high in the ACF. In the final linear regression model, the adjusted R-squared value is high enough and the residual plots are good enough to make predictions.

## 
## Call:
## lm(formula = sold_count ~ category_sold + category_favored + 
##     basket_count, data = sold)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -86.278 -11.238  -0.387   8.763 168.980 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       4.7442865  2.8040394   1.692   0.0915 .  
## category_sold     0.1187613  0.0062677  18.948  < 2e-16 ***
## category_favored -0.0015302  0.0002083  -7.347 1.34e-12 ***
## basket_count      0.1407651  0.0090971  15.474  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 25.46 on 365 degrees of freedom
## Multiple R-squared:  0.8403, Adjusted R-squared:  0.839 
## F-statistic: 640.4 on 3 and 365 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 140.82, df = 10, p-value < 2.2e-16

##    sold_count    
##  Min.   : 14.00  
##  1st Qu.: 33.00  
##  Median : 56.00  
##  Mean   : 74.17  
##  3rd Qu.: 89.00  
##  Max.   :447.00
## 
## Call:
## lm(formula = sold_count ~ big_outlier + category_sold + category_favored + 
##     basket_count, data = sold)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -80.651  -8.335  -1.034   8.277 121.209 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      11.5878617  2.3643596   4.901 1.44e-06 ***
## big_outlier      76.5329182  5.7826657  13.235  < 2e-16 ***
## category_sold     0.0867377  0.0056964  15.227  < 2e-16 ***
## category_favored -0.0008900  0.0001781  -4.998 9.01e-07 ***
## basket_count      0.1075103  0.0078954  13.617  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.95 on 364 degrees of freedom
## Multiple R-squared:  0.8922, Adjusted R-squared:  0.891 
## F-statistic: 753.2 on 4 and 364 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 112.47, df = 10, p-value < 2.2e-16
## 
## Call:
## lm(formula = sold_count ~ lag1 + big_outlier + category_sold + 
##     category_favored + basket_count, data = sold)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -78.630  -7.746  -0.706   7.253 123.997 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       9.9269325  2.0130544   4.931 1.25e-06 ***
## lag1              0.5443102  0.0457488  11.898  < 2e-16 ***
## big_outlier      63.1752763  5.0382831  12.539  < 2e-16 ***
## category_sold     0.0940932  0.0048777  19.290  < 2e-16 ***
## category_favored -0.0009748  0.0001514  -6.438 3.84e-10 ***
## basket_count      0.1106151  0.0067112  16.482  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.79 on 363 degrees of freedom
## Multiple R-squared:  0.9225, Adjusted R-squared:  0.9214 
## F-statistic: 863.6 on 5 and 363 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 18.357, df = 10, p-value = 0.04924
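The three-step procedure above (attributes from the correlation matrix, then an outlier dummy, then a lag-1 term) can be sketched on synthetic data; every number and coefficient below is an assumption for illustration, not the project data:

```r
set.seed(1)
n <- 369
# Synthetic stand-ins for the real attributes (values assumed for illustration)
category_sold    <- round(rnorm(n, 500, 150))
category_favored <- round(rnorm(n, 8000, 2000))
basket_count     <- round(rnorm(n, 300, 80))
sold_count <- 5 + 0.1 * category_sold - 0.001 * category_favored +
  0.14 * basket_count + rnorm(n, 0, 20)
spike_days <- seq(30, 360, by = 45)               # campaign-like spikes
sold_count[spike_days] <- sold_count[spike_days] + 150
sold <- data.frame(sold_count, category_sold, category_favored, basket_count)

# Step 1: baseline regression with the attributes picked from the correlation matrix
fit1 <- lm(sold_count ~ category_sold + category_favored + basket_count, data = sold)

# Step 2: a dummy marking the outlier days (e.g. campaigns and holidays)
cutoff <- quantile(sold$sold_count, 0.75) + 1.5 * IQR(sold$sold_count)
sold$big_outlier <- as.integer(sold$sold_count > cutoff)

# Step 3: a lag-1 term, motivated by the high autocorrelation at lag 1
sold$lag1 <- c(NA, head(sold$sold_count, -1))
fit3 <- lm(sold_count ~ lag1 + big_outlier + category_sold + category_favored +
             basket_count, data = sold)
summary(fit3)$adj.r.squared   # should improve on summary(fit1)$adj.r.squared
```

`lm()` silently drops the first row (where `lag1` is `NA`), so no manual trimming is needed.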

ARIMA Model For Product 1

The second type of model to be built is an ARIMA model. For this model, the data should first be decomposed, which requires choosing a frequency. Since there is no significant seasonality, the lag with the highest ACF value, 63, is chosen as the frequency. Additive decomposition is used for this task. Below, the random series can be seen.

After the decomposition, (p,d,q) values should be chosen for the model by examining the ACF and PACF. Looking at the ACF, 1 or 7 can be chosen for 'q', and looking at the PACF, 1 can be chosen for 'p'. The auto.arima function is used as well. The AIC and BIC values of the candidate models can be seen below; smaller AIC and BIC values indicate a better model. Looking at them, the (2,0,2) model suggested by auto.arima is the best. After the model is selected, the regressors that correlate most with the sold count are added to improve it. The final model has lower AIC and BIC values, so we can proceed with it.

## 
## Call:
## arima(x = detrend, order = c(1, 0, 1))
## 
## Coefficients:
##          ar1     ma1  intercept
##       0.6650  0.0123    -1.5566
## s.e.  0.0574  0.0702     6.0436
## 
## sigma^2 estimated as 1244:  log likelihood = -1529.77,  aic = 3067.54
## [1] 3067.536
## [1] 3082.443
## 
## Call:
## arima(x = detrend, order = c(1, 0, 7))
## 
## Coefficients:
##          ar1      ma1      ma2      ma3      ma4      ma5      ma6      ma7
##       0.8658  -0.2496  -0.0680  -0.1138  -0.2193  -0.1632  -0.0457  -0.1405
## s.e.  0.0427   0.0696   0.0622   0.0643   0.0589   0.0551   0.0697   0.0702
##       intercept
##         -0.4768
## s.e.     0.5468
## 
## sigma^2 estimated as 1129:  log likelihood = -1516.43,  aic = 3052.87
## [1] 3052.868
## [1] 3090.136
## Series: detrend 
## ARIMA(2,0,2) with zero mean 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2
##       1.5221  -0.6871  -0.8673  0.1966
## s.e.  0.1703   0.0984   0.1811  0.0930
## 
## sigma^2 estimated as 1201:  log likelihood=-1522.43
## AIC=3054.86   AICc=3055.06   BIC=3073.5
## [1] 3054.864
## [1] 3073.498
## 
## Call:
## arima(x = detrend, order = c(2, 0, 2), xreg = xreg)
## 
## Coefficients:
##          ar1      ar2      ma1     ma2  intercept   xreg1   xreg2
##       0.8477  -0.1219  -0.1993  0.1917   -52.5534  0.1673  -2e-04
## s.e.  0.2838   0.2328   0.2780  0.0934     7.9501  0.0180   3e-04
## 
## sigma^2 estimated as 780.6:  log likelihood = -1458.35,  aic = 2932.71
## [1] 2932.707
## [1] 2962.521
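The order-selection step above can be sketched with base R's `arima`; the series below is synthetic, and the auto.arima suggestion is written in as a hard-coded third candidate rather than computed (auto.arima lives in the forecast package, which is not used here):

```r
set.seed(7)
# Synthetic stand-in for the random component of the decomposed series
detrend <- arima.sim(model = list(ar = 0.6), n = 369, sd = 30)

# Candidates: (1,0,1) and (1,0,7) from the ACF/PACF, (2,0,2) as the auto.arima pick
for (ord in list(c(1, 0, 1), c(1, 0, 7), c(2, 0, 2))) {
  fit <- arima(detrend, order = ord)
  cat(sprintf("ARIMA(%d,%d,%d): AIC = %.1f, BIC = %.1f\n",
              ord[1], ord[2], ord[3], AIC(fit), BIC(fit)))
}
```

`AIC()` and `BIC()` both work on `arima` fits through their `logLik` method, so the candidates can be ranked directly.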

Comparison Of Models

We selected two models for prediction; their accuracy values can be seen here. According to the box plot, the variance of the weighted mean absolute errors for the linear model is higher, especially toward the end. We should choose the ARIMA model because its WMAPE value is lower, which indicates a better model.

##          variable  n     mean       sd        CV       FBias      MAPE     RMSE
## 1:  lm_prediction 14 83.35714 17.09074 0.2050303 -0.72352232 0.8010225 109.8228
## 2: selected_arima 14 83.35714 17.09074 0.2050303 -0.03885441 0.3287008  35.2479
##         MAD      MADP     WMAPE
## 1: 63.38325 0.7603817 0.7603817
## 2: 26.33523 0.3159325 0.3159325
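The comparison metric can be sketched directly: WMAPE weights absolute errors by the actual sales, so large-volume days count more. All numbers below are made up for illustration:

```r
# WMAPE = sum(|actual - predicted|) / sum(|actual|)
wmape <- function(actual, pred) sum(abs(actual - pred)) / sum(abs(actual))

actual     <- c(80, 95, 70, 110, 60)    # hypothetical test-period sales
pred_lm    <- c(150, 40, 130, 30, 120)  # hypothetical linear model predictions
pred_arima <- c(85, 90, 75, 100, 65)    # hypothetical ARIMA predictions

c(lm = wmape(actual, pred_lm), arima = wmape(actual, pred_arima))
```

The model with the smaller WMAPE is preferred, mirroring the comparison table above.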

In conclusion, here is a plot of the actual test set and the predicted values of the chosen model. As can be seen, the predictions are fairly accurate.

PRODUCT 2 - Sleepy Baby Wipes

Before building forecasting models for Product 2, the data should be plotted and its seasonality and trend examined. Below is the plot of the sales quantity of Product 2. There is no significant trend, nor any significant seasonality. For a closer look, there is a plot of three months of 2021 (March, April and May). Again, the seasonality is not significant, though there is a spike at the beginning of each month. In May, there is a large increase, probably due to Covid-19 conditions. In conclusion, there may be monthly seasonality, but it is not very clear.

Linear Regression Model For Product 2

The first type of model to be used is linear regression. First of all, it is wise to select the attributes that will help the model from the correlation matrix. Below, you can see the correlations between the attributes. According to this matrix, category_sold, category_visits, and basket_count can be added to the model.

In the first model, these attributes are added. The adjusted R-squared value indicates how well the model fits; for the first model it is quite high, which is a good sign. However, there are outliers, probably due to campaigns and holidays, which can be handled for a better model. Lastly, a 'lag1' attribute can be added because the autocorrelation at lag 1 is very high in the ACF. In the final linear regression model, the adjusted R-squared value is high enough and the residual plots are good enough to make predictions.

## 
## Call:
## lm(formula = sold_count ~ category_sold + category_visits + basket_count, 
##     data = sold)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -422.40  -60.15    1.95   63.20 1208.91 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -60.17090   11.42700  -5.266 2.39e-07 ***
## category_sold     0.14185    0.02200   6.449 3.58e-10 ***
## category_visits   0.00693    0.01256   0.552    0.581    
## basket_count      0.18780    0.01162  16.161  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 128.9 on 365 degrees of freedom
## Multiple R-squared:  0.9068, Adjusted R-squared:  0.906 
## F-statistic:  1183 on 3 and 365 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 125.12, df = 10, p-value < 2.2e-16

##    sold_count    
##  Min.   :  30.0  
##  1st Qu.: 165.0  
##  Median : 238.0  
##  Mean   : 381.4  
##  3rd Qu.: 431.0  
##  Max.   :4191.0
## 
## Call:
## lm(formula = sold_count ~ big_outlier + category_sold + category_visits + 
##     basket_count, data = sold)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -356.35  -52.28   10.07   53.54 1315.86 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -2.148e+01  1.241e+01  -1.730   0.0845 .  
## big_outlier      2.303e+02  3.592e+01   6.410 4.51e-10 ***
## category_sold    1.425e-01  2.088e-02   6.824 3.71e-11 ***
## category_visits -4.873e-04  1.198e-02  -0.041   0.9676    
## basket_count     1.477e-01  1.268e-02  11.655  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 122.4 on 364 degrees of freedom
## Multiple R-squared:  0.9162, Adjusted R-squared:  0.9153 
## F-statistic: 995.2 on 4 and 364 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 95.607, df = 10, p-value = 4.11e-16
## 
## Call:
## lm(formula = sold_count ~ lag1 + big_outlier + category_sold + 
##     category_visits + basket_count, data = sold)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -381.58  -37.12    4.89   39.84 1334.45 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -40.28635   11.39508  -3.535  0.00046 ***
## lag1              0.44599    0.04880   9.140  < 2e-16 ***
## big_outlier     178.62606   32.91952   5.426 1.05e-07 ***
## category_sold     0.13014    0.01890   6.886 2.54e-11 ***
## category_visits   0.01271    0.01091   1.165  0.24494    
## basket_count      0.15168    0.01145  13.244  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 110.5 on 363 degrees of freedom
## Multiple R-squared:  0.9319, Adjusted R-squared:  0.931 
## F-statistic: 993.4 on 5 and 363 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 74.502, df = 10, p-value = 5.947e-12

ARIMA Model For Product 2

The second type of model to be built is an ARIMA model. The data should first be decomposed, which requires choosing a frequency. Since there is no significant seasonality, the lag with the highest ACF value, 34, is chosen as the frequency. Additive decomposition is used for this task. Below, the random series can be seen.

After the decomposition, (p,d,q) values should be chosen by examining the ACF and PACF. Looking at the ACF, 1 or 11 can be chosen for 'q', and looking at the PACF, 1 can be chosen for 'p'. The auto.arima function is used as well. The AIC and BIC values of the candidate models can be seen below; looking at them, the (1,0,11) model is the best. After the model is selected, the regressors that correlate most with the sold count are added to improve it. The final model has lower AIC and BIC values, so we can proceed with it.

## 
## Call:
## arima(x = detrend, order = c(1, 0, 1))
## 
## Coefficients:
##          ar1     ma1  intercept
##       0.5985  0.1204    -2.2120
## s.e.  0.0598  0.0686    45.0812
## 
## sigma^2 estimated as 88277:  log likelihood = -2383.17,  aic = 4774.34
## [1] 4774.343
## [1] 4789.6
## 
## Call:
## arima(x = detrend, order = c(1, 0, 11))
## 
## Coefficients:
##          ar1     ma1     ma2      ma3      ma4      ma5      ma6      ma7
##       0.5115  0.0898  0.0048  -0.1392  -0.1806  -0.2103  -0.1589  -0.1076
## s.e.  0.2066  0.2088  0.1286   0.0770   0.0556   0.0745   0.0945   0.0925
##           ma8      ma9     ma10     ma11  intercept
##       -0.0942  -0.0735  -0.0572  -0.0731     0.3060
## s.e.   0.0784   0.0771   0.0727   0.0640     2.0291
## 
## sigma^2 estimated as 76841:  log likelihood = -2361.76,  aic = 4751.51
## [1] 4751.515
## [1] 4804.913
## Series: detrend 
## ARIMA(3,0,0) with zero mean 
## 
## Coefficients:
##          ar1      ar2      ar3
##       0.7228  -0.0081  -0.1412
## s.e.  0.0540   0.0669   0.0539
## 
## sigma^2 estimated as 86941:  log likelihood=-2379.15
## AIC=4766.29   AICc=4766.41   BIC=4781.55
## [1] 4766.292
## [1] 4781.549
## 
## Call:
## arima(x = detrend, order = c(1, 0, 11), xreg = xreg)
## 
## Coefficients:
##          ar1     ma1    ma2     ma3     ma4    ma5     ma6     ma7     ma8
##       0.5558  0.1483  0.178  0.1079  0.0327  8e-04  0.0653  0.0634  0.0101
## s.e.     NaN     NaN    NaN     NaN     NaN    NaN     NaN     NaN     NaN
##          ma9    ma10    ma11  intercept   xreg1   xreg2   xreg3
##       0.0076  0.0436  0.0388  -450.0970  0.1404  0.0732  0.0487
## s.e.     NaN  0.0533  0.0598    33.0371  0.0164  0.0184  0.0316
## 
## sigma^2 estimated as 19786:  log likelihood = -2132.8,  aic = 4299.6
## [1] 4299.597
## [1] 4364.438
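The "ARIMA plus regressors" step used in the final models above can be sketched with base R's `arima(..., xreg = ...)`; the series, regressor names and coefficients below are synthetic assumptions:

```r
set.seed(3)
n <- 200
# Hypothetical external regressors correlated with the sold count
xreg <- cbind(basket = rnorm(n, 300, 60), catsold = rnorm(n, 500, 100))

# Sales = linear effect of the regressors + AR(1) noise (assumed structure)
y <- 0.15 * xreg[, "basket"] + 0.05 * xreg[, "catsold"] +
  arima.sim(model = list(ar = 0.5), n = n, sd = 10)

# ARIMA errors plus regressors, the same pattern as the final models above
fit <- arima(y, order = c(1, 0, 1), xreg = xreg)

# Forecasting then requires future regressor values via newxreg
future_x <- cbind(basket = c(310, 290), catsold = c(480, 520))
predict(fit, n.ahead = 2, newxreg = future_x)$pred
```

Note that forecasts from a regression-with-ARIMA-errors model are only as good as the future regressor values supplied in `newxreg`.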

Comparison Of Models

We selected two models for prediction; their accuracy values can be seen here. According to the box plot, the weighted mean absolute errors for the ARIMA model are higher. We should choose the linear model because its WMAPE value is lower, which indicates a better model.

##          variable  n     mean      sd        CV      FBias      MAPE     RMSE
## 1:  lm_prediction 14 542.4286 335.978 0.6193958 -0.1358889 0.2050354 263.4115
## 2: selected_arima 14 542.4286 335.978 0.6193958  0.8441860 0.8331456 649.5670
##         MAD      MADP     WMAPE
## 1: 115.9278 0.2137200 0.2137200
## 2: 512.1721 0.9442203 0.9442203

In conclusion, here is a plot of the actual test set and the predicted values of the chosen model. As can be seen, the predictions are fairly accurate.

PRODUCT 3 - Xiaomi Bluetooth Headphones

Looking at the plots of the product below: in the line graph it can be observed that the sales have high variance, the plot has peaks on some dates, and there might be cyclical behaviour, which is an indicator of seasonality. For further investigation, the '3 Months Sales of 2021' plot can be examined; there is no clear repeating pattern that can be easily observed.

Looking at the boxplots: in the weekly boxplot the sales on weekdays seem similar, so daily and weekly seasonality can be investigated. In the monthly boxplot, sales change with respect to month, but there is no clear repeating monthly behaviour. In the histograms, one can observe that the sales distribution is close to a normal distribution.

Trying Different ARIMA Models for Product 3 - 6676673

Firstly, different ARIMA models can be built in order to test them on the test set. Before building an ARIMA model, the data should be decomposed, and a frequency value should be chosen. Frequencies of 30 and 7 days can be selected and the data decomposed accordingly. In addition, the ACF plot of the data can be examined, and a lag with high autocorrelation can be chosen as another trial frequency. Since the variance does not seem to be increasing, additive decomposition is used. Below, the random series can be seen.

Decomposition with 7 Day Freq

Decomposition with 30 Day Freq

The above decomposition series belong to the time series with 7 and 30 day frequency, respectively.

Looking at the ACF plot of the series, the highest ACF value belongs to lag 32, so a time series decomposition with 32 day frequency is also examined.

In time series decomposition, the random part is assumed to be randomly distributed with mean zero and constant variance; to decide on the best frequency, the random part of each decomposed series should be observed. In this case, the random part of the decomposition with 7 day frequency seems closest to such a series, so it is chosen as the final decomposition.
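This frequency check can be sketched by decomposing a synthetic series at the candidate frequencies and comparing the spread of the remainders; the series below has a built-in 7 day season, so the 7 day decomposition should leave the smallest remainder:

```r
set.seed(11)
# Synthetic sales with a weekly pattern plus noise (assumed, for illustration)
x <- 50 + 10 * sin(2 * pi * (1:364) / 7) + rnorm(364, 0, 5)

# Decompose at each candidate frequency and measure the remainder's spread
random_sd <- sapply(c(7, 30), function(f) {
  dec <- decompose(ts(x, frequency = f), type = "additive")
  sd(dec$random, na.rm = TRUE)   # remainder has NAs at both ends
})
names(random_sd) <- c("freq7", "freq30")
random_sd  # the frequency whose remainder looks most like white noise is preferred
```

The same comparison, applied to the real sales series at frequencies 7, 30 and 32, motivates the choice made above.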

After the decomposition, (p,d,q) values should be chosen for the model by examining the ACF and PACF: peaks in the ACF suggest candidate 'q' values, and peaks in the PACF suggest candidate 'p' values. Looking at the ACF, 3 or 4 may be selected for 'q', and looking at the PACF, 3 or 9 may be selected for 'p'. The auto.arima function is used as well. The AIC and BIC values of the candidate models can be seen below; smaller AIC and BIC values indicate a better model. Looking at them, the (3,0,4) model chosen from the ACF and PACF is the best among them.

## 
## Call:
## arima(x = detrend, order = c(3, 0, 3))
## 
## Coefficients:
##          ar1     ar2      ar3      ma1      ma2      ma3  intercept
##       0.3628  0.1249  -0.3545  -0.5395  -0.4382  -0.0223    -0.0160
## s.e.  0.1570  0.2211   0.1415   0.1652   0.2446   0.2010     0.0715
## 
## sigma^2 estimated as 9123:  log likelihood = -2376.31,  aic = 4768.62
## [1] 4768.62
## [1] 4800.491
## 
## Call:
## arima(x = detrend, order = c(3, 0, 4))
## 
## Coefficients:
##          ar1      ar2     ar3      ma1     ma2      ma3     ma4  intercept
##       1.0944  -0.3181  0.0127  -1.3286  0.1373  -0.2304  0.4219    -0.0193
## s.e.  0.1386   0.1984  0.0992   0.1290  0.2123   0.1199  0.0750     0.0142
## 
## sigma^2 estimated as 8476:  log likelihood = -2363.65,  aic = 4745.3
## [1] 4745.295
## [1] 4781.151
## 
## Call:
## arima(x = detrend, order = c(9, 0, 4))
## 
## Coefficients:
##          ar1     ar2     ar3      ar4     ar5     ar6      ar7     ar8      ar9
##       0.5188  0.3226  0.0766  -0.4315  0.2393  0.0072  -0.0727  0.1220  -0.0980
## s.e.  0.6411  0.5899  0.3413   0.5940  0.2110  0.2125   0.2393  0.0627   0.1065
##           ma1      ma2      ma3     ma4  intercept
##       -0.7461  -0.6490  -0.3833  0.7785    -0.0194
## s.e.   0.6119   0.7234   0.4355  0.5112     0.0133
## 
## sigma^2 estimated as 8335:  log likelihood = -2360.63,  aic = 4751.26
## [1] 4751.256
## [1] 4811.016
## 
##  Fitting models using approximations to speed things up...
## 
##  ARIMA(2,0,2)           with non-zero mean : 4793.392
##  ARIMA(0,0,0)           with non-zero mean : 4925.4
##  ARIMA(1,0,0)           with non-zero mean : 4919.53
##  ARIMA(0,0,1)           with non-zero mean : 4916.887
##  ARIMA(0,0,0)           with zero mean     : 4923.38
##  ARIMA(1,0,2)           with non-zero mean : Inf
##  ARIMA(2,0,1)           with non-zero mean : 4793.637
##  ARIMA(3,0,2)           with non-zero mean : Inf
##  ARIMA(2,0,3)           with non-zero mean : Inf
##  ARIMA(1,0,1)           with non-zero mean : 4919.128
##  ARIMA(1,0,3)           with non-zero mean : Inf
##  ARIMA(3,0,1)           with non-zero mean : Inf
##  ARIMA(3,0,3)           with non-zero mean : Inf
##  ARIMA(2,0,2)           with zero mean     : 4791.758
##  ARIMA(1,0,2)           with zero mean     : 4818.262
##  ARIMA(2,0,1)           with zero mean     : 4792.121
##  ARIMA(3,0,2)           with zero mean     : Inf
##  ARIMA(2,0,3)           with zero mean     : Inf
##  ARIMA(1,0,1)           with zero mean     : 4917.087
##  ARIMA(1,0,3)           with zero mean     : Inf
##  ARIMA(3,0,1)           with zero mean     : Inf
##  ARIMA(3,0,3)           with zero mean     : Inf
## 
##  Now re-fitting the best model(s) without approximations...
## 
##  ARIMA(2,0,2)           with zero mean     : Inf
##  ARIMA(2,0,1)           with zero mean     : Inf
##  ARIMA(2,0,2)           with non-zero mean : Inf
##  ARIMA(2,0,1)           with non-zero mean : Inf
##  ARIMA(1,0,2)           with zero mean     : Inf
##  ARIMA(0,0,1)           with non-zero mean : 4916.897
## 
##  Best model: ARIMA(0,0,1)           with non-zero mean
## Series: detrend 
## ARIMA(0,0,1) with non-zero mean 
## 
## Coefficients:
##          ma1     mean
##       0.1699  -0.1507
## s.e.  0.0485   6.8933
## 
## sigma^2 estimated as 13863:  log likelihood=-2455.42
## AIC=4916.84   AICc=4916.9   BIC=4928.79
## [1] 4916.836
## [1] 4928.787

Trying Different Linear Regression Models For Product 3

The second type of model to be used is linear regression. Below, you can see the correlations between the attributes. According to this matrix, basket_count, price_count, visit_count and favored_count can be added to the model. Since it has been observed in the box plots above that the data changes monthly, month information can also be added to the candidate models.

Comparison of the Linear Regression and ARIMA Models for Product 3

The performance of the different linear regression and ARIMA models on the test dates will be calculated, and the best model will be selected accordingly.

##             variable  n     mean       sd        CV       FBias       MAPE
## 1:    lm_prediction2 14 451.5714 90.71063 0.2008777 -0.02562712 0.09325325
## 2:    lm_prediction3 14 451.5714 90.71063 0.2008777 -0.07674217 0.11904792
## 3:    lm_prediction4 14 451.5714 90.71063 0.2008777 -0.08393829 0.11670367
## 4:    lm_prediction5 14 451.5714 90.71063 0.2008777 -0.11437066 0.12859235
## 5:    lm_prediction6 14 451.5714 90.71063 0.2008777 -0.03526619 0.07687833
## 6:    lm_prediction7 14 451.5714 90.71063 0.2008777 -0.10621457 0.12427903
## 7:  arima_prediction 14 451.5714 90.71063 0.2008777  0.05141121 0.12779687
## 8: sarima_prediction 14 451.5714 90.71063 0.2008777  0.05256333 0.12798436
## 9:    selected_arima 14 451.5714 90.71063 0.2008777  0.09418716 0.17941751
##         RMSE      MAD       MADP      WMAPE
## 1:  49.35944 40.27963 0.08919881 0.08919881
## 2:  58.72637 50.19184 0.11114927 0.11114927
## 3:  59.66218 49.63226 0.10991009 0.10991009
## 4:  65.04994 56.12976 0.12429875 0.12429875
## 5:  42.38769 32.55791 0.07209913 0.07209913
## 6:  61.13382 53.07757 0.11753969 0.11753969
## 7:  77.45611 61.04713 0.13518821 0.13518821
## 8:  77.46723 61.18399 0.13549128 0.13549128
## 9: 100.82860 81.07444 0.17953847 0.17953847

The smallest weighted mean absolute percentage error is obtained for the linear regression model 'sold_count ~ basket_count + visit_count + as.factor(mon) + as.factor(is_campaign)', so this model is selected for prediction purposes.

In conclusion, here is a plot of the actual test set and the predicted values of the chosen model. As can be seen, the predictions are fairly accurate.

One Day Ahead Prediction with the Selected Model for Product 3

With the selected model, a one-day-ahead prediction can be performed using all the data on hand, since a one-day-ahead prediction should be submitted in this competition.

##     price event_date product_content_id sold_count visit_count favored_count
## 1: 119.66 2021-07-01            6676673        312       11562           777
##    basket_count category_sold category_brand_sold category_visits ty_visits
## 1:          930          4839                 752          256832 106491398
##    category_basket category_favored w_day mon is_campaign
## 1:           21667            19158     5   7           0
##     price event_date product_content_id sold_count visit_count favored_count
## 1: 119.66 2021-07-03            6676673        312       11562           777
##    basket_count category_sold category_brand_sold category_visits ty_visits
## 1:          930          4839                 752          256832 106491398
##    category_basket category_favored w_day mon is_campaign lm_prediction
## 1:           21667            19158     5   7           0      366.5632
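The one-day-ahead step can be sketched with a fitted lm and `predict(newdata = ...)`; the data frame, coefficients and tomorrow's regressor values below are synthetic stand-ins, not the project's actual model:

```r
set.seed(5)
n <- 100
# Synthetic training data mimicking the selected model's inputs (assumed values)
dat <- data.frame(
  basket_count = rnorm(n, 900, 100),
  visit_count  = rnorm(n, 11000, 1500),
  mon          = factor(rep(1:12, length.out = n))   # month as a factor
)
dat$sold_count <- 50 + 0.2 * dat$basket_count + 0.01 * dat$visit_count +
  rnorm(n, 0, 15)

fit <- lm(sold_count ~ basket_count + visit_count + mon, data = dat)

# Tomorrow's regressors: latest observed values, with the month set explicitly
tomorrow <- data.frame(basket_count = 930, visit_count = 11562,
                       mon = factor(7, levels = levels(dat$mon)))
predict(fit, newdata = tomorrow)
```

The factor levels of `mon` in `newdata` must match those seen at fit time, hence the explicit `levels =` argument.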

PRODUCT 4 - Fakir Vacuum Cleaner

Looking at the plots of the product below: in the line graph it can be observed that the sales have high variance, the plot has high outliers on some dates, and there might be cyclical behaviour, which is an indicator of seasonality. For further investigation, the '3 Months Sales of 2021' plot can be examined; there is no clear repeating pattern that can be easily observed.

Looking at the boxplots: in the weekly boxplot the sales on weekdays seem similar, so daily and weekly seasonality can be investigated. In the monthly boxplot, sales change with respect to month, but there is no clear repeating monthly behaviour. In the histograms, one can observe that the sales distribution is close to a normal distribution.

Trying Different ARIMA Models for Product 4 - 7061886

Firstly, different ARIMA models can be built in order to test them on the test set. Frequencies of 30 and 7 days can be selected and the data decomposed accordingly. Since the variance does not seem to be increasing, additive decomposition is used. Below, the random series can be seen.

The above decomposition series belong to the time series with 7 and 30 day frequency, respectively. Looking at the ACF plot of the series, the highest ACF value belongs to lag 16, so a time series decomposition with 16 day frequency is also examined.

In this case, the random part of the decomposed time series with 16 day frequency seems closest to a randomly distributed series with mean zero, so it is chosen as the final decomposition.

Looking at the ACF, 5 or 7 may be selected for 'q', and looking at the PACF, 1 or 3 may be selected for 'p'. The auto.arima function is used as well. The AIC and BIC values of the candidate models can be seen below. The ARIMA(3,0,5) model, selected by observing the ACF and PACF plots, has a smaller AIC value than the ARIMA(1,0,2) model suggested by auto.arima, so ARIMA(3,0,5) will be used for the performance comparison with the linear models.

## 
## Call:
## arima(x = detrend, order = c(3, 0, 7))
## 
## Coefficients:
##         ar1     ar2      ar3    ma1      ma2     ma3      ma4     ma5     ma6
##       0.829  0.6108  -0.5578  -0.57  -0.7939  0.0527  -0.0214  0.1489  0.0724
## s.e.    NaN     NaN      NaN    NaN      NaN     NaN   0.0743  0.0730  0.0317
##          ma7  intercept
##       0.1115    -0.0971
## s.e.  0.0542     0.0720
## 
## sigma^2 estimated as 13975:  log likelihood = -2400.34,  aic = 4824.68
## [1] 4824.677
## [1] 4872.178
## 
## Call:
## arima(x = detrend, order = c(3, 0, 5))
## 
## Coefficients:
##          ar1     ar2     ar3      ma1      ma2     ma3     ma4     ma5
##       0.7748  0.8682  -0.759  -0.5212  -1.0612  0.1521  0.1158  0.3146
## s.e.     NaN     NaN     NaN      NaN      NaN  0.0821  0.0583  0.0507
##       intercept
##         -0.0893
## s.e.     0.0823
## 
## sigma^2 estimated as 14210:  log likelihood = -2403.47,  aic = 4826.94
## [1] 4826.937
## [1] 4866.521
## 
## Call:
## arima(x = detrend, order = c(1, 0, 5))
## 
## Coefficients:
##          ar1      ma1      ma2      ma3      ma4      ma5  intercept
##       0.5901  -0.2768  -0.0907  -0.2992  -0.1936  -0.1397    -0.0462
## s.e.  0.0721   0.0779   0.0548   0.0589   0.0598   0.0596     0.3752
## 
## sigma^2 estimated as 14996:  log likelihood = -2411.97,  aic = 4839.94
## [1] 4839.942
## [1] 4871.609
## 
##  Fitting models using approximations to speed things up...
## 
##  ARIMA(2,0,2)            with non-zero mean : 4853.412
##  ARIMA(0,0,0)            with non-zero mean : 5003.415
##  ARIMA(1,0,0)            with non-zero mean : 4904.066
##  ARIMA(0,0,1)            with non-zero mean : 4925.875
##  ARIMA(0,0,0)            with zero mean     : 5001.402
##  ARIMA(1,0,2)            with non-zero mean : 4893.675
##  ARIMA(2,0,1)            with non-zero mean : 4907.233
##  ARIMA(3,0,2)            with non-zero mean : Inf
##  ARIMA(2,0,3)            with non-zero mean : 4845.681
##  ARIMA(1,0,3)            with non-zero mean : 4894.608
##  ARIMA(3,0,3)            with non-zero mean : Inf
##  ARIMA(2,0,4)            with non-zero mean : 4847.091
##  ARIMA(1,0,4)            with non-zero mean : Inf
##  ARIMA(3,0,4)            with non-zero mean : Inf
##  ARIMA(2,0,3)            with zero mean     : 4843.93
##  ARIMA(1,0,3)            with zero mean     : 4892.558
##  ARIMA(2,0,2)            with zero mean     : 4851.59
##  ARIMA(3,0,3)            with zero mean     : Inf
##  ARIMA(2,0,4)            with zero mean     : 4845.366
##  ARIMA(1,0,2)            with zero mean     : 4891.627
##  ARIMA(1,0,4)            with zero mean     : Inf
##  ARIMA(3,0,2)            with zero mean     : Inf
##  ARIMA(3,0,4)            with zero mean     : Inf
## 
##  Now re-fitting the best model(s) without approximations...
## 
##  ARIMA(2,0,3)            with zero mean     : Inf
##  ARIMA(2,0,4)            with zero mean     : Inf
##  ARIMA(2,0,3)            with non-zero mean : Inf
##  ARIMA(2,0,4)            with non-zero mean : Inf
##  ARIMA(2,0,2)            with zero mean     : Inf
##  ARIMA(2,0,2)            with non-zero mean : Inf
##  ARIMA(1,0,2)            with zero mean     : 4891.046
## 
##  Best model: ARIMA(1,0,2)            with zero mean
## Series: detrend 
## ARIMA(1,0,2) with zero mean 
## 
## Coefficients:
##          ar1     ma1     ma2
##       0.1257  0.3647  0.2845
## s.e.  0.1452  0.1389  0.0696
## 
## sigma^2 estimated as 17790:  log likelihood=-2441.47
## AIC=4890.94   AICc=4891.05   BIC=4906.78
## [1] 4890.942
## [1] 4906.775

Trying Different Linear Regression Models For Product 4

Below, you can see the correlations between the attributes. According to this matrix, basket_count, category_favored, is_campaign, and category_sold can be added to the model in different combinations. Since the box plots above show a monthly change in the data, month information can also be added to the candidate models.
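One of the candidate models can be sketched as below. This is a minimal illustration, not the exact project code: the data frame names `train` and `test` are assumptions, while the column names come from the correlation matrix discussed above.

```r
# Fit one candidate linear regression; `train`/`test` are assumed splits
# of the product data (before / after 29 May 2021).
fit <- lm(sold_count ~ basket_count + category_favored + is_campaign +
            as.factor(mon), data = train)

# Adjusted R-squared is used to compare candidate models
summary(fit)$adj.r.squared

# Predictions for the test period, to be scored later
test$lm_prediction <- predict(fit, newdata = test)
```

Each candidate combination of attributes can be fitted the same way and compared on the test dates.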

Comparison of the Linear Regression and ARIMA Models for Product 4

The performance of the different linear regression and ARIMA models on the test dates will be calculated, and the best model will be selected accordingly.

##             variable  n mean       sd        CV       FBias      MAPE      RMSE
## 1:    lm_prediction1 14   21 7.200427 0.3428775 -0.20693883 0.2697431  5.966694
## 2:    lm_prediction2 14   21 7.200427 0.3428775 -2.97927177 3.4236791 75.719475
## 3:    lm_prediction3 14   21 7.200427 0.3428775 -3.28869474 3.8131788 83.123744
## 4:    lm_prediction4 14   21 7.200427 0.3428775 -3.05884773 3.5175993 76.628455
## 5:    lm_prediction5 14   21 7.200427 0.3428775 -0.35648820 0.3872554 13.486489
## 6:    lm_prediction6 14   21 7.200427 0.3428775 -2.81391925 3.2181353 71.004414
## 7:  arima_prediction 14   21 7.200427 0.3428775 -0.09014912 0.2865406  7.276734
## 8: sarima_prediction 14   21 7.200427 0.3428775  0.02528538 0.2798477  7.197155
## 9:    selected_arima 14   21 7.200427 0.3428775  0.10692728 0.3737146  9.239168
##          MAD      MADP     WMAPE
## 1:  5.193758 0.2473218 0.2473218
## 2: 62.564707 2.9792718 2.9792718
## 3: 69.062590 3.2886947 3.2886947
## 4: 64.235802 3.0588477 3.0588477
## 5:  8.640406 0.4114479 0.4114479
## 6: 59.092304 2.8139193 2.8139193
## 7:  5.722414 0.2724959 0.2724959
## 8:  5.455298 0.2597761 0.2597761
## 9:  7.505308 0.3573956 0.3573956
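The accuracy measures reported in the table above can be computed with a small helper function. The sketch below assumes `actual` and `forecast` are numeric vectors over the 14 test days; it covers only a subset of the reported columns. Note that WMAPE weights the absolute errors by total actual sales, so high-volume days dominate the score.

```r
# A sketch of the test-set accuracy measures, assuming equal-length
# numeric vectors `actual` and `forecast`.
accu <- function(actual, forecast) {
  error <- actual - forecast
  data.frame(
    n     = length(actual),
    FBias = sum(error) / sum(actual),          # forecast bias
    MAPE  = mean(abs(error / actual)),         # mean abs. percentage error
    RMSE  = sqrt(mean(error^2)),               # root mean squared error
    MAD   = mean(abs(error)),                  # mean absolute deviation
    WMAPE = sum(abs(error)) / sum(actual)      # weighted MAPE
  )
}
```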

The smallest Weighted Mean Absolute Percentage Error is obtained for the linear regression model ‘sold_count ~ basket_count + as.factor(mon)’. However, since that model has two input attributes, a slight increase in either of them has a large effect on the prediction. Therefore the model with the second smallest WMAPE was chosen: ARIMA(1,1,4) on the decomposed series with 16-day frequency, which is also the model suggested by auto.arima. This model is used for our prediction purposes from here on.

In conclusion, here is a plot of the actual test set and the predicted values of the chosen model. As can be seen, the predictions are reasonably close to the actual values.

One Day Ahead Prediction with the Selected Model for Product 4

With the selected model, a one-day-ahead prediction can be performed using all the data on hand, since a one-day-ahead prediction must be submitted in this competition.

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.0067 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
## 
## Call:
## arima(x = detrend1, order = c(1, 1, 4), xreg = data_7061886$is_campaign)
## 
## Coefficients:
##           ar1      ma1      ma2      ma3      ma4  data_7061886$is_campaign
##       -0.0681  -0.3684  -0.1174  -0.1867  -0.3275                   21.6371
## s.e.   0.1656   0.1552   0.1043   0.0565   0.0649                    5.4080
## 
## sigma^2 estimated as 453.3:  log likelihood = -1730.7,  aic = 3475.4
## [1] 3475.401
## [1] 3503.092
##    price event_date product_content_id sold_count visit_count favored_count
## 1: 297.9 2021-07-03            7061886         15        1074           103
##    basket_count category_sold category_brand_sold category_visits ty_visits
## 1:           51           886                 184           64930 106491398
##    category_basket category_favored w_day mon is_campaign arima1_prediction
## 1:            3324             5648     5   7           0          4.456416
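The one-day-ahead step shown in the output above can be sketched as follows. `data_7061886` and the call `arima(detrend1, order = c(1, 1, 4), xreg = ...)` appear in the output; the trend object `trend1` used to return to the original scale is an assumption about how the detrended series was built.

```r
# A sketch of the one-day-ahead forecast with the selected ARIMA(1,1,4).
# `detrend1` is the series with its 16-day decomposition trend removed;
# `trend1` (the trend component) is assumed to exist from that step.
fit <- arima(detrend1, order = c(1, 1, 4),
             xreg = data_7061886$is_campaign)

# Forecast one step ahead, supplying tomorrow's is_campaign value
ahead <- predict(fit, n.ahead = 1,
                 newxreg = tail(data_7061886$is_campaign, 1))

# Add the last available trend estimate back to reach the sold_count scale
prediction <- ahead$pred[1] + tail(na.omit(trend1), 1)
```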

PRODUCT 5 - TrendyolMilla Tights

Looking at the plots of the product below: the line graph shows that the sales have increasing variance, with high outliers on some dates, and there may be a cyclical behaviour, which is an indicator of seasonality. For further investigation, the ‘3 Months Sales of 2021’ plot can be examined; no clear repeating pattern can be easily observed there.

Looking at the boxplots: in the weekly boxplot, sales on the weekdays seem to be similar, so daily and weekly seasonality can be investigated further. In the monthly boxplot, there is variation with respect to months, although the monthly medians seem to be close to each other; this may be an indicator of monthly seasonality. The histograms show that the distribution of the sales is close to a normal distribution.

Trying Different ARIMA Models for Product 5 - 31515569

Firstly, different ARIMA models can be built in order to test them on the test set. Frequencies of 30 and 7 days can be selected, and the data can be decomposed accordingly. Since the variance seems to be increasing, a multiplicative decomposition can be used. Below, the random series can be seen.

The decomposition series above belong to the time series with 7- and 30-day frequency, respectively. Looking at the ACF plot of the series, the highest ACF value belongs to lag 16, so a time series decomposition with 16-day frequency would be sufficient.

In this case, the random part of the time series decomposed with 16-day frequency seems closest to a randomly distributed series with mean zero and standard deviation 1, so it is chosen as the final decomposition.
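The decomposition steps above can be sketched as below. The object name `sales` (the daily sold_count vector for this product) is an assumption; the frequencies and the multiplicative type follow the text.

```r
# Multiplicative decompositions at the candidate frequencies
dec7  <- decompose(ts(sales, frequency = 7),  type = "multiplicative")
dec30 <- decompose(ts(sales, frequency = 30), type = "multiplicative")
dec16 <- decompose(ts(sales, frequency = 16), type = "multiplicative")

# Random (remainder) component of the chosen 16-day decomposition;
# decompose() leaves NAs at the series ends, so strip them for modelling
plot(dec16$random)
random16 <- na.omit(dec16$random)
```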

Looking at the ACF, 2, 5, or 8 may be selected for the ‘q’ value, and looking at the PACF, 3 or 4 may be selected for the ‘p’ value. The auto.arima function is used as well. The AIC and BIC values of the suggested models can be seen below. The AIC value of ARIMA(3,0,5), the model selected by examining the ACF and PACF plots, is smaller than that of the ARIMA(1,0,3) model suggested by auto.arima, so ARIMA(3,0,5) will be used for the performance comparison with the linear models.
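The order-selection step can be sketched as below, assuming `random16` is the NA-stripped random component of the chosen 16-day multiplicative decomposition. The manual candidate is compared against the auto.arima suggestion by AIC and BIC.

```r
library(forecast)

# Inspect autocorrelations to pick candidate orders:
acf(random16)    # suggests q = 2, 5, or 8
pacf(random16)   # suggests p = 3 or 4

# Manually chosen candidate vs. the automatic search
manual <- arima(random16, order = c(3, 0, 5))
auto   <- auto.arima(random16, trace = TRUE)

# Compare information criteria of the two candidates
AIC(manual); BIC(manual)
AIC(auto);   BIC(auto)
```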

Trying Different Linear Regression Models For Product 5

Below, you can see the correlations between the attributes. According to this matrix, basket_count, favored_count, is_campaign, and category_sold can be added to the model in different combinations. Since the box plots above show a monthly change in the data, month information can also be added to the candidate models.

Comparison of the Linear Regression and ARIMA Models for Product 5

The performance of the different linear regression and ARIMA models on the test dates will be calculated, and the best model will be selected accordingly.

## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6394.698
##  ARIMA(0,0,0) with non-zero mean : 6220.27
##  ARIMA(0,0,1) with zero mean     : 6126.411
##  ARIMA(0,0,1) with non-zero mean : 6009.282
##  ARIMA(0,0,2) with zero mean     : 6043.806
##  ARIMA(0,0,2) with non-zero mean : 5957.195
##  ARIMA(0,0,3) with zero mean     : 5942.598
##  ARIMA(0,0,3) with non-zero mean : 5884.053
##  ARIMA(0,0,4) with zero mean     : 5921.67
##  ARIMA(0,0,4) with non-zero mean : 5877.716
##  ARIMA(0,0,5) with zero mean     : 5918.286
##  ARIMA(0,0,5) with non-zero mean : 5879.596
##  ARIMA(1,0,0) with zero mean     : 5928.848
##  ARIMA(1,0,0) with non-zero mean : 5911.463
##  ARIMA(1,0,1) with zero mean     : 5929.506
##  ARIMA(1,0,1) with non-zero mean : 5909.434
##  ARIMA(1,0,2) with zero mean     : 5926.647
##  ARIMA(1,0,2) with non-zero mean : 5903.617
##  ARIMA(1,0,3) with zero mean     : 5911.226
##  ARIMA(1,0,3) with non-zero mean : 5877.901
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5879.617
##  ARIMA(2,0,0) with zero mean     : 5929.216
##  ARIMA(2,0,0) with non-zero mean : 5907.817
##  ARIMA(2,0,1) with zero mean     : 5930.771
##  ARIMA(2,0,1) with non-zero mean : 5902.491
##  ARIMA(2,0,2) with zero mean     : 5925.483
##  ARIMA(2,0,2) with non-zero mean : 5891.825
##  ARIMA(2,0,3) with zero mean     : 5911.948
##  ARIMA(2,0,3) with non-zero mean : 5879.561
##  ARIMA(3,0,0) with zero mean     : 5928.15
##  ARIMA(3,0,0) with non-zero mean : 5900.006
##  ARIMA(3,0,1) with zero mean     : 5930.061
##  ARIMA(3,0,1) with non-zero mean : 5899.854
##  ARIMA(3,0,2) with zero mean     : 5933.983
##  ARIMA(3,0,2) with non-zero mean : 5887.709
##  ARIMA(4,0,0) with zero mean     : 5929.24
##  ARIMA(4,0,0) with non-zero mean : 5894.763
##  ARIMA(4,0,1) with zero mean     : 5925.197
##  ARIMA(4,0,1) with non-zero mean : 5891.308
##  ARIMA(5,0,0) with zero mean     : 5906.657
##  ARIMA(5,0,0) with non-zero mean : 5884.985
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## 
##  ARIMA(0,0,0)            with zero mean     : 6394.698
##  ARIMA(0,0,0)            with non-zero mean : 6220.27
##  ARIMA(0,0,0)(0,0,1)[16] with zero mean     : 6294.138
##  ARIMA(0,0,0)(0,0,1)[16] with non-zero mean : 6182.676
##  ARIMA(0,0,0)(0,0,2)[16] with zero mean     : 6267.205
##  ARIMA(0,0,0)(0,0,2)[16] with non-zero mean : 6181.942
##  ARIMA(0,0,0)(1,0,0)[16] with zero mean     : 6247.732
##  ARIMA(0,0,0)(1,0,0)[16] with non-zero mean : 6180.785
##  ARIMA(0,0,0)(1,0,1)[16] with zero mean     : 6236.257
##  ARIMA(0,0,0)(1,0,1)[16] with non-zero mean : 6182.301
##  ARIMA(0,0,0)(1,0,2)[16] with zero mean     : Inf
##  ARIMA(0,0,0)(1,0,2)[16] with non-zero mean : 6183.096
##  ARIMA(0,0,0)(2,0,0)[16] with zero mean     : 6244.526
##  ARIMA(0,0,0)(2,0,0)[16] with non-zero mean : 6182.286
##  ARIMA(0,0,0)(2,0,1)[16] with zero mean     : Inf
##  ARIMA(0,0,0)(2,0,1)[16] with non-zero mean : Inf
##  ARIMA(0,0,0)(2,0,2)[16] with zero mean     : Inf
##  ARIMA(0,0,0)(2,0,2)[16] with non-zero mean : Inf
##  ARIMA(0,0,1)            with zero mean     : 6126.411
##  ARIMA(0,0,1)            with non-zero mean : 6009.282
##  ARIMA(0,0,1)(0,0,1)[16] with zero mean     : 6069.038
##  ARIMA(0,0,1)(0,0,1)[16] with non-zero mean : 5987.659
##  ARIMA(0,0,1)(0,0,2)[16] with zero mean     : 6052.357
##  ARIMA(0,0,1)(0,0,2)[16] with non-zero mean : 5987.348
##  ARIMA(0,0,1)(1,0,0)[16] with zero mean     : 6043.194
##  ARIMA(0,0,1)(1,0,0)[16] with non-zero mean : 5985.413
##  ARIMA(0,0,1)(1,0,1)[16] with zero mean     : 6028.021
##  ARIMA(0,0,1)(1,0,1)[16] with non-zero mean : 5987.431
##  ARIMA(0,0,1)(1,0,2)[16] with zero mean     : Inf
##  ARIMA(0,0,1)(1,0,2)[16] with non-zero mean : 5989.218
##  ARIMA(0,0,1)(2,0,0)[16] with zero mean     : 6037.264
##  ARIMA(0,0,1)(2,0,0)[16] with non-zero mean : 5987.43
##  ARIMA(0,0,1)(2,0,1)[16] with zero mean     : Inf
##  ARIMA(0,0,1)(2,0,1)[16] with non-zero mean : 5989.3
##  ARIMA(0,0,1)(2,0,2)[16] with zero mean     : Inf
##  ARIMA(0,0,1)(2,0,2)[16] with non-zero mean : 5991.24
##  ARIMA(0,0,2)            with zero mean     : 6043.806
##  ARIMA(0,0,2)            with non-zero mean : 5957.195
##  ARIMA(0,0,2)(0,0,1)[16] with zero mean     : 6001.241
##  ARIMA(0,0,2)(0,0,1)[16] with non-zero mean : 5939.246
##  ARIMA(0,0,2)(0,0,2)[16] with zero mean     : 5992.86
##  ARIMA(0,0,2)(0,0,2)[16] with non-zero mean : 5940.305
##  ARIMA(0,0,2)(1,0,0)[16] with zero mean     : 5986.603
##  ARIMA(0,0,2)(1,0,0)[16] with non-zero mean : 5938.173
##  ARIMA(0,0,2)(1,0,1)[16] with zero mean     : 5977.252
##  ARIMA(0,0,2)(1,0,1)[16] with non-zero mean : 5940.24
##  ARIMA(0,0,2)(1,0,2)[16] with zero mean     : Inf
##  ARIMA(0,0,2)(1,0,2)[16] with non-zero mean : 5942.317
##  ARIMA(0,0,2)(2,0,0)[16] with zero mean     : 5983.547
##  ARIMA(0,0,2)(2,0,0)[16] with non-zero mean : 5940.24
##  ARIMA(0,0,2)(2,0,1)[16] with zero mean     : Inf
##  ARIMA(0,0,2)(2,0,1)[16] with non-zero mean : 5942.318
##  ARIMA(0,0,3)            with zero mean     : 5942.598
##  ARIMA(0,0,3)            with non-zero mean : 5884.053
##  ARIMA(0,0,3)(0,0,1)[16] with zero mean     : 5917.294
##  ARIMA(0,0,3)(0,0,1)[16] with non-zero mean : 5873.652
##  ARIMA(0,0,3)(0,0,2)[16] with zero mean     : 5915.625
##  ARIMA(0,0,3)(0,0,2)[16] with non-zero mean : 5875.614
##  ARIMA(0,0,3)(1,0,0)[16] with zero mean     : 5911.389
##  ARIMA(0,0,3)(1,0,0)[16] with non-zero mean : 5873.482
##  ARIMA(0,0,3)(1,0,1)[16] with zero mean     : 5904.144
##  ARIMA(0,0,3)(1,0,1)[16] with non-zero mean : 5875.549
##  ARIMA(0,0,3)(2,0,0)[16] with zero mean     : 5910.439
##  ARIMA(0,0,3)(2,0,0)[16] with non-zero mean : 5875.553
##  ARIMA(0,0,4)            with zero mean     : 5921.67
##  ARIMA(0,0,4)            with non-zero mean : 5877.716
##  ARIMA(0,0,4)(0,0,1)[16] with zero mean     : 5902.592
##  ARIMA(0,0,4)(0,0,1)[16] with non-zero mean : 5868.317
##  ARIMA(0,0,4)(1,0,0)[16] with zero mean     : 5898.323
##  ARIMA(0,0,4)(1,0,0)[16] with non-zero mean : 5867.987
##  ARIMA(0,0,5)            with zero mean     : 5918.286
##  ARIMA(0,0,5)            with non-zero mean : 5879.596
##  ARIMA(1,0,0)            with zero mean     : 5928.848
##  ARIMA(1,0,0)            with non-zero mean : 5911.463
##  ARIMA(1,0,0)(0,0,1)[16] with zero mean     : 5913.662
##  ARIMA(1,0,0)(0,0,1)[16] with non-zero mean : 5898.296
##  ARIMA(1,0,0)(0,0,2)[16] with zero mean     : 5915.174
##  ARIMA(1,0,0)(0,0,2)[16] with non-zero mean : 5900.264
##  ARIMA(1,0,0)(1,0,0)[16] with zero mean     : 5912.762
##  ARIMA(1,0,0)(1,0,0)[16] with non-zero mean : 5898.366
##  ARIMA(1,0,0)(1,0,1)[16] with zero mean     : 5914.729
##  ARIMA(1,0,0)(1,0,1)[16] with non-zero mean : 5900.247
##  ARIMA(1,0,0)(1,0,2)[16] with zero mean     : 5915.576
##  ARIMA(1,0,0)(1,0,2)[16] with non-zero mean : Inf
##  ARIMA(1,0,0)(2,0,0)[16] with zero mean     : 5914.766
##  ARIMA(1,0,0)(2,0,0)[16] with non-zero mean : 5900.282
##  ARIMA(1,0,0)(2,0,1)[16] with zero mean     : Inf
##  ARIMA(1,0,0)(2,0,1)[16] with non-zero mean : Inf
##  ARIMA(1,0,0)(2,0,2)[16] with zero mean     : Inf
##  ARIMA(1,0,0)(2,0,2)[16] with non-zero mean : 5904.293
##  ARIMA(1,0,1)            with zero mean     : 5929.506
##  ARIMA(1,0,1)            with non-zero mean : 5909.434
##  ARIMA(1,0,1)(0,0,1)[16] with zero mean     : 5914.708
##  ARIMA(1,0,1)(0,0,1)[16] with non-zero mean : 5897.18
##  ARIMA(1,0,1)(0,0,2)[16] with zero mean     : 5915.989
##  ARIMA(1,0,1)(0,0,2)[16] with non-zero mean : 5899.016
##  ARIMA(1,0,1)(1,0,0)[16] with zero mean     : 5913.493
##  ARIMA(1,0,1)(1,0,0)[16] with non-zero mean : 5896.933
##  ARIMA(1,0,1)(1,0,1)[16] with zero mean     : 5915.217
##  ARIMA(1,0,1)(1,0,1)[16] with non-zero mean : 5898.97
##  ARIMA(1,0,1)(1,0,2)[16] with zero mean     : 5915.94
##  ARIMA(1,0,1)(1,0,2)[16] with non-zero mean : 5900.99
##  ARIMA(1,0,1)(2,0,0)[16] with zero mean     : 5915.382
##  ARIMA(1,0,1)(2,0,0)[16] with non-zero mean : 5898.974
##  ARIMA(1,0,1)(2,0,1)[16] with zero mean     : Inf
##  ARIMA(1,0,1)(2,0,1)[16] with non-zero mean : 5901.041
##  ARIMA(1,0,2)            with zero mean     : 5926.647
##  ARIMA(1,0,2)            with non-zero mean : 5903.617
##  ARIMA(1,0,2)(0,0,1)[16] with zero mean     : 5912.013
##  ARIMA(1,0,2)(0,0,1)[16] with non-zero mean : 5892.174
##  ARIMA(1,0,2)(0,0,2)[16] with zero mean     : 5913.573
##  ARIMA(1,0,2)(0,0,2)[16] with non-zero mean : 5894.22
##  ARIMA(1,0,2)(1,0,0)[16] with zero mean     : 5910.984
##  ARIMA(1,0,2)(1,0,0)[16] with non-zero mean : 5892.276
##  ARIMA(1,0,2)(1,0,1)[16] with zero mean     : 5912.652
##  ARIMA(1,0,2)(1,0,1)[16] with non-zero mean : 5894.206
##  ARIMA(1,0,2)(2,0,0)[16] with zero mean     : 5912.904
##  ARIMA(1,0,2)(2,0,0)[16] with non-zero mean : 5894.253
##  ARIMA(1,0,3)            with zero mean     : 5911.226
##  ARIMA(1,0,3)            with non-zero mean : 5877.901
##  ARIMA(1,0,3)(0,0,1)[16] with zero mean     : 5896.345
##  ARIMA(1,0,3)(0,0,1)[16] with non-zero mean : 5868.702
##  ARIMA(1,0,3)(1,0,0)[16] with zero mean     : 5893.882
##  ARIMA(1,0,3)(1,0,0)[16] with non-zero mean : 5868.454
##  ARIMA(1,0,4)            with zero mean     : Inf
##  ARIMA(1,0,4)            with non-zero mean : 5879.617
##  ARIMA(2,0,0)            with zero mean     : 5929.216
##  ARIMA(2,0,0)            with non-zero mean : 5907.817
##  ARIMA(2,0,0)(0,0,1)[16] with zero mean     : 5914.481
##  ARIMA(2,0,0)(0,0,1)[16] with non-zero mean : 5895.919
##  ARIMA(2,0,0)(0,0,2)[16] with zero mean     : 5915.699
##  ARIMA(2,0,0)(0,0,2)[16] with non-zero mean : 5897.689
##  ARIMA(2,0,0)(1,0,0)[16] with zero mean     : 5913.188
##  ARIMA(2,0,0)(1,0,0)[16] with non-zero mean : 5895.566
##  ARIMA(2,0,0)(1,0,1)[16] with zero mean     : 5914.807
##  ARIMA(2,0,0)(1,0,1)[16] with non-zero mean : 5897.626
##  ARIMA(2,0,0)(1,0,2)[16] with zero mean     : 5915.466
##  ARIMA(2,0,0)(1,0,2)[16] with non-zero mean : 5899.655
##  ARIMA(2,0,0)(2,0,0)[16] with zero mean     : 5914.183
##  ARIMA(2,0,0)(2,0,0)[16] with non-zero mean : 5897.627
##  ARIMA(2,0,0)(2,0,1)[16] with zero mean     : Inf
##  ARIMA(2,0,0)(2,0,1)[16] with non-zero mean : 5899.699
##  ARIMA(2,0,1)            with zero mean     : 5930.771
##  ARIMA(2,0,1)            with non-zero mean : 5902.491
##  ARIMA(2,0,1)(0,0,1)[16] with zero mean     : 5915.467
##  ARIMA(2,0,1)(0,0,1)[16] with non-zero mean : 5890.756
##  ARIMA(2,0,1)(0,0,2)[16] with zero mean     : 5916.721
##  ARIMA(2,0,1)(0,0,2)[16] with non-zero mean : 5892.439
##  ARIMA(2,0,1)(1,0,0)[16] with zero mean     : 5914.155
##  ARIMA(2,0,1)(1,0,0)[16] with non-zero mean : 5890.223
##  ARIMA(2,0,1)(1,0,1)[16] with zero mean     : 5914.858
##  ARIMA(2,0,1)(1,0,1)[16] with non-zero mean : 5892.293
##  ARIMA(2,0,1)(2,0,0)[16] with zero mean     : 5916.06
##  ARIMA(2,0,1)(2,0,0)[16] with non-zero mean : 5892.295
##  ARIMA(2,0,2)            with zero mean     : 5925.483
##  ARIMA(2,0,2)            with non-zero mean : 5891.825
##  ARIMA(2,0,2)(0,0,1)[16] with zero mean     : 5910.169
##  ARIMA(2,0,2)(0,0,1)[16] with non-zero mean : 5880.613
##  ARIMA(2,0,2)(1,0,0)[16] with zero mean     : 5908.505
##  ARIMA(2,0,2)(1,0,0)[16] with non-zero mean : 5880.4
##  ARIMA(2,0,3)            with zero mean     : 5911.948
##  ARIMA(2,0,3)            with non-zero mean : 5879.561
##  ARIMA(3,0,0)            with zero mean     : 5928.15
##  ARIMA(3,0,0)            with non-zero mean : 5900.006
##  ARIMA(3,0,0)(0,0,1)[16] with zero mean     : 5913.108
##  ARIMA(3,0,0)(0,0,1)[16] with non-zero mean : 5888.623
##  ARIMA(3,0,0)(0,0,2)[16] with zero mean     : 5914.335
##  ARIMA(3,0,0)(0,0,2)[16] with non-zero mean : 5890.524
##  ARIMA(3,0,0)(1,0,0)[16] with zero mean     : 5911.652
##  ARIMA(3,0,0)(1,0,0)[16] with non-zero mean : 5888.37
##  ARIMA(3,0,0)(1,0,1)[16] with zero mean     : 5912.781
##  ARIMA(3,0,0)(1,0,1)[16] with non-zero mean : 5890.44
##  ARIMA(3,0,0)(2,0,0)[16] with zero mean     : 5913.366
##  ARIMA(3,0,0)(2,0,0)[16] with non-zero mean : 5890.443
##  ARIMA(3,0,1)            with zero mean     : 5930.061
##  ARIMA(3,0,1)            with non-zero mean : 5899.854
##  ARIMA(3,0,1)(0,0,1)[16] with zero mean     : 5914.874
##  ARIMA(3,0,1)(0,0,1)[16] with non-zero mean : 5888.16
##  ARIMA(3,0,1)(1,0,0)[16] with zero mean     : 5913.344
##  ARIMA(3,0,1)(1,0,0)[16] with non-zero mean : 5887.858
##  ARIMA(3,0,2)            with zero mean     : 5933.983
##  ARIMA(3,0,2)            with non-zero mean : 5887.709
##  ARIMA(4,0,0)            with zero mean     : 5929.24
##  ARIMA(4,0,0)            with non-zero mean : 5894.763
##  ARIMA(4,0,0)(0,0,1)[16] with zero mean     : 5913.391
##  ARIMA(4,0,0)(0,0,1)[16] with non-zero mean : 5882.648
##  ARIMA(4,0,0)(1,0,0)[16] with zero mean     : 5911.555
##  ARIMA(4,0,0)(1,0,0)[16] with non-zero mean : 5882.25
##  ARIMA(4,0,1)            with zero mean     : 5925.197
##  ARIMA(4,0,1)            with non-zero mean : 5891.308
##  ARIMA(5,0,0)            with zero mean     : 5906.657
##  ARIMA(5,0,0)            with non-zero mean : 5884.985
## 
## 
## 
##  Best model: ARIMA(0,0,4)(1,0,0)[16] with non-zero mean 
## 
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6411.073
##  ARIMA(0,0,0) with non-zero mean : 6236.361
##  ARIMA(0,0,1) with zero mean     : 6142.057
##  ARIMA(0,0,1) with non-zero mean : 6024.666
##  ARIMA(0,0,2) with zero mean     : 6059.217
##  ARIMA(0,0,2) with non-zero mean : 5972.392
##  ARIMA(0,0,3) with zero mean     : 5957.704
##  ARIMA(0,0,3) with non-zero mean : 5899.022
##  ARIMA(0,0,4) with zero mean     : 5936.713
##  ARIMA(0,0,4) with non-zero mean : 5892.644
##  ARIMA(0,0,5) with zero mean     : 5933.312
##  ARIMA(0,0,5) with non-zero mean : 5894.52
##  ARIMA(1,0,0) with zero mean     : 5943.923
##  ARIMA(1,0,0) with non-zero mean : 5926.474
##  ARIMA(1,0,1) with zero mean     : 5944.579
##  ARIMA(1,0,1) with non-zero mean : 5924.436
##  ARIMA(1,0,2) with zero mean     : 5941.704
##  ARIMA(1,0,2) with non-zero mean : 5918.603
##  ARIMA(1,0,3) with zero mean     : 5926.236
##  ARIMA(1,0,3) with non-zero mean : 5892.825
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5894.542
##  ARIMA(2,0,0) with zero mean     : 5944.289
##  ARIMA(2,0,0) with non-zero mean : 5922.817
##  ARIMA(2,0,1) with zero mean     : 5945.842
##  ARIMA(2,0,1) with non-zero mean : 5917.489
##  ARIMA(2,0,2) with zero mean     : 5940.532
##  ARIMA(2,0,2) with non-zero mean : 5906.798
##  ARIMA(2,0,3) with zero mean     : 5926.952
##  ARIMA(2,0,3) with non-zero mean : 5894.486
##  ARIMA(3,0,0) with zero mean     : 5943.214
##  ARIMA(3,0,0) with non-zero mean : 5914.994
##  ARIMA(3,0,1) with zero mean     : 5945.125
##  ARIMA(3,0,1) with non-zero mean : 5914.844
##  ARIMA(3,0,2) with zero mean     : 5949.048
##  ARIMA(3,0,2) with non-zero mean : 5902.651
##  ARIMA(4,0,0) with zero mean     : 5944.301
##  ARIMA(4,0,0) with non-zero mean : 5909.75
##  ARIMA(4,0,1) with zero mean     : 5940.242
##  ARIMA(4,0,1) with non-zero mean : 5906.275
##  ARIMA(5,0,0) with zero mean     : 5921.646
##  ARIMA(5,0,0) with non-zero mean : 5899.919
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6427.441
##  ARIMA(0,0,0) with non-zero mean : 6252.459
##  ARIMA(0,0,1) with zero mean     : 6157.663
##  ARIMA(0,0,1) with non-zero mean : 6040.13
##  ARIMA(0,0,2) with zero mean     : 6074.59
##  ARIMA(0,0,2) with non-zero mean : 5987.647
##  ARIMA(0,0,3) with zero mean     : 5972.797
##  ARIMA(0,0,3) with non-zero mean : 5914.014
##  ARIMA(0,0,4) with zero mean     : 5951.733
##  ARIMA(0,0,4) with non-zero mean : 5907.603
##  ARIMA(0,0,5) with zero mean     : 5948.315
##  ARIMA(0,0,5) with non-zero mean : 5909.476
##  ARIMA(1,0,0) with zero mean     : 5958.974
##  ARIMA(1,0,0) with non-zero mean : 5941.511
##  ARIMA(1,0,1) with zero mean     : 5959.626
##  ARIMA(1,0,1) with non-zero mean : 5939.472
##  ARIMA(1,0,2) with zero mean     : 5956.74
##  ARIMA(1,0,2) with non-zero mean : 5933.614
##  ARIMA(1,0,3) with zero mean     : 5941.224
##  ARIMA(1,0,3) with non-zero mean : 5907.777
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5909.497
##  ARIMA(2,0,0) with zero mean     : 5959.336
##  ARIMA(2,0,0) with non-zero mean : 5937.855
##  ARIMA(2,0,1) with zero mean     : 5960.886
##  ARIMA(2,0,1) with non-zero mean : 5932.536
##  ARIMA(2,0,2) with zero mean     : 5955.56
##  ARIMA(2,0,2) with non-zero mean : 5921.801
##  ARIMA(2,0,3) with zero mean     : 5941.936
##  ARIMA(2,0,3) with non-zero mean : 5909.443
##  ARIMA(3,0,0) with zero mean     : 5958.254
##  ARIMA(3,0,0) with non-zero mean : 5930.021
##  ARIMA(3,0,1) with zero mean     : 5960.163
##  ARIMA(3,0,1) with non-zero mean : 5929.876
##  ARIMA(3,0,2) with zero mean     : 5964.043
##  ARIMA(3,0,2) with non-zero mean : 5917.618
##  ARIMA(4,0,0) with zero mean     : 5959.338
##  ARIMA(4,0,0) with non-zero mean : 5924.785
##  ARIMA(4,0,1) with zero mean     : 5955.262
##  ARIMA(4,0,1) with non-zero mean : 5921.289
##  ARIMA(5,0,0) with zero mean     : 5936.613
##  ARIMA(5,0,0) with non-zero mean : 5914.888
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6443.865
##  ARIMA(0,0,0) with non-zero mean : 6268.435
##  ARIMA(0,0,1) with zero mean     : 6173.382
##  ARIMA(0,0,1) with non-zero mean : 6055.43
##  ARIMA(0,0,2) with zero mean     : 6090.049
##  ARIMA(0,0,2) with non-zero mean : 6002.788
##  ARIMA(0,0,3) with zero mean     : 5987.993
##  ARIMA(0,0,3) with non-zero mean : 5928.924
##  ARIMA(0,0,4) with zero mean     : 5966.857
##  ARIMA(0,0,4) with non-zero mean : 5922.489
##  ARIMA(0,0,5) with zero mean     : 5963.413
##  ARIMA(0,0,5) with non-zero mean : 5924.36
##  ARIMA(1,0,0) with zero mean     : 5974.091
##  ARIMA(1,0,0) with non-zero mean : 5956.506
##  ARIMA(1,0,1) with zero mean     : 5974.742
##  ARIMA(1,0,1) with non-zero mean : 5954.456
##  ARIMA(1,0,2) with zero mean     : 5971.838
##  ARIMA(1,0,2) with non-zero mean : 5948.576
##  ARIMA(1,0,3) with zero mean     : 5956.299
##  ARIMA(1,0,3) with non-zero mean : 5922.662
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5924.381
##  ARIMA(2,0,0) with zero mean     : 5974.451
##  ARIMA(2,0,0) with non-zero mean : 5952.834
##  ARIMA(2,0,1) with zero mean     : 5976.002
##  ARIMA(2,0,1) with non-zero mean : 5947.503
##  ARIMA(2,0,2) with zero mean     : 5970.652
##  ARIMA(2,0,2) with non-zero mean : 5936.729
##  ARIMA(2,0,3) with zero mean     : 5957.005
##  ARIMA(2,0,3) with non-zero mean : 5924.327
##  ARIMA(3,0,0) with zero mean     : 5973.359
##  ARIMA(3,0,0) with non-zero mean : 5944.976
##  ARIMA(3,0,1) with zero mean     : 5975.269
##  ARIMA(3,0,1) with non-zero mean : 5944.828
##  ARIMA(3,0,2) with zero mean     : 5979.166
##  ARIMA(3,0,2) with non-zero mean : 5932.525
##  ARIMA(4,0,0) with zero mean     : 5974.444
##  ARIMA(4,0,0) with non-zero mean : 5939.724
##  ARIMA(4,0,1) with zero mean     : 5970.356
##  ARIMA(4,0,1) with non-zero mean : 5936.211
##  ARIMA(5,0,0) with zero mean     : 5951.655
##  ARIMA(5,0,0) with non-zero mean : 5929.788
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6460.525
##  ARIMA(0,0,0) with non-zero mean : 6284.275
##  ARIMA(0,0,1) with zero mean     : 6189.3
##  ARIMA(0,0,1) with non-zero mean : 6070.695
##  ARIMA(0,0,2) with zero mean     : 6105.797
##  ARIMA(0,0,2) with non-zero mean : 6017.938
##  ARIMA(0,0,3) with zero mean     : 6003.444
##  ARIMA(0,0,3) with non-zero mean : 5943.901
##  ARIMA(0,0,4) with zero mean     : 5982.251
##  ARIMA(0,0,4) with non-zero mean : 5937.465
##  ARIMA(0,0,5) with zero mean     : 5978.782
##  ARIMA(0,0,5) with non-zero mean : 5939.338
##  ARIMA(1,0,0) with zero mean     : 5989.48
##  ARIMA(1,0,0) with non-zero mean : 5971.634
##  ARIMA(1,0,1) with zero mean     : 5990.12
##  ARIMA(1,0,1) with non-zero mean : 5969.553
##  ARIMA(1,0,2) with zero mean     : 5987.224
##  ARIMA(1,0,2) with non-zero mean : 5963.659
##  ARIMA(1,0,3) with zero mean     : 5971.637
##  ARIMA(1,0,3) with non-zero mean : 5937.644
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5939.375
##  ARIMA(2,0,0) with zero mean     : 5989.827
##  ARIMA(2,0,0) with non-zero mean : 5967.918
##  ARIMA(2,0,1) with zero mean     : 5991.371
##  ARIMA(2,0,1) with non-zero mean : 5962.536
##  ARIMA(2,0,2) with zero mean     : 5986.037
##  ARIMA(2,0,2) with non-zero mean : 5951.741
##  ARIMA(2,0,3) with zero mean     : 5972.331
##  ARIMA(2,0,3) with non-zero mean : 5939.304
##  ARIMA(3,0,0) with zero mean     : 5988.74
##  ARIMA(3,0,0) with non-zero mean : 5960.018
##  ARIMA(3,0,1) with zero mean     : 5990.65
##  ARIMA(3,0,1) with non-zero mean : 5959.852
##  ARIMA(3,0,2) with zero mean     : 5994.578
##  ARIMA(3,0,2) with non-zero mean : 5947.546
##  ARIMA(4,0,0) with zero mean     : 5989.823
##  ARIMA(4,0,0) with non-zero mean : 5954.721
##  ARIMA(4,0,1) with zero mean     : 5985.709
##  ARIMA(4,0,1) with non-zero mean : 5951.193
##  ARIMA(5,0,0) with zero mean     : 5966.948
##  ARIMA(5,0,0) with non-zero mean : 5944.777
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6477.126
##  ARIMA(0,0,0) with non-zero mean : 6300.121
##  ARIMA(0,0,1) with zero mean     : 6205.004
##  ARIMA(0,0,1) with non-zero mean : 6085.987
##  ARIMA(0,0,2) with zero mean     : 6121.194
##  ARIMA(0,0,2) with non-zero mean : 6033.099
##  ARIMA(0,0,3) with zero mean     : 6018.542
##  ARIMA(0,0,3) with non-zero mean : 5958.844
##  ARIMA(0,0,4) with zero mean     : 5997.263
##  ARIMA(0,0,4) with non-zero mean : 5952.394
##  ARIMA(0,0,5) with zero mean     : 5993.777
##  ARIMA(0,0,5) with non-zero mean : 5954.265
##  ARIMA(1,0,0) with zero mean     : 6004.527
##  ARIMA(1,0,0) with non-zero mean : 5986.636
##  ARIMA(1,0,1) with zero mean     : 6005.161
##  ARIMA(1,0,1) with non-zero mean : 5984.556
##  ARIMA(1,0,2) with zero mean     : 6002.251
##  ARIMA(1,0,2) with non-zero mean : 5978.644
##  ARIMA(1,0,3) with zero mean     : 5986.615
##  ARIMA(1,0,3) with non-zero mean : 5952.568
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5954.287
##  ARIMA(2,0,0) with zero mean     : 6004.868
##  ARIMA(2,0,0) with non-zero mean : 5982.924
##  ARIMA(2,0,1) with zero mean     : 6006.412
##  ARIMA(2,0,1) with non-zero mean : 5977.545
##  ARIMA(2,0,2) with zero mean     : 6001.055
##  ARIMA(2,0,2) with non-zero mean : 5966.715
##  ARIMA(2,0,3) with zero mean     : 5987.305
##  ARIMA(2,0,3) with non-zero mean : 5954.232
##  ARIMA(3,0,0) with zero mean     : 6003.771
##  ARIMA(3,0,0) with non-zero mean : 5975.018
##  ARIMA(3,0,1) with zero mean     : 6005.681
##  ARIMA(3,0,1) with non-zero mean : 5974.852
##  ARIMA(3,0,2) with zero mean     : 6009.584
##  ARIMA(3,0,2) with non-zero mean : 5962.49
##  ARIMA(4,0,0) with zero mean     : 6004.851
##  ARIMA(4,0,0) with non-zero mean : 5969.711
##  ARIMA(4,0,1) with zero mean     : 6000.722
##  ARIMA(4,0,1) with non-zero mean : 5966.162
##  ARIMA(5,0,0) with zero mean     : 5981.909
##  ARIMA(5,0,0) with non-zero mean : 5959.712
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6493.708
##  ARIMA(0,0,0) with non-zero mean : 6315.97
##  ARIMA(0,0,1) with zero mean     : 6220.824
##  ARIMA(0,0,1) with non-zero mean : 6101.242
##  ARIMA(0,0,2) with zero mean     : 6136.698
##  ARIMA(0,0,2) with non-zero mean : 6048.213
##  ARIMA(0,0,3) with zero mean     : 6033.648
##  ARIMA(0,0,3) with non-zero mean : 5973.783
##  ARIMA(0,0,4) with zero mean     : 6012.288
##  ARIMA(0,0,4) with non-zero mean : 5967.309
##  ARIMA(0,0,5) with zero mean     : 6008.774
##  ARIMA(0,0,5) with non-zero mean : 5969.181
##  ARIMA(1,0,0) with zero mean     : 6019.581
##  ARIMA(1,0,0) with non-zero mean : 6001.628
##  ARIMA(1,0,1) with zero mean     : 6020.215
##  ARIMA(1,0,1) with non-zero mean : 5999.536
##  ARIMA(1,0,2) with zero mean     : 6017.279
##  ARIMA(1,0,2) with non-zero mean : 5993.619
##  ARIMA(1,0,3) with zero mean     : 6001.594
##  ARIMA(1,0,3) with non-zero mean : 5967.485
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5969.202
##  ARIMA(2,0,0) with zero mean     : 6019.921
##  ARIMA(2,0,0) with non-zero mean : 5997.9
##  ARIMA(2,0,1) with zero mean     : 6021.464
##  ARIMA(2,0,1) with non-zero mean : 5992.514
##  ARIMA(2,0,2) with zero mean     : 6016.071
##  ARIMA(2,0,2) with non-zero mean : 5981.685
##  ARIMA(2,0,3) with zero mean     : 6002.278
##  ARIMA(2,0,3) with non-zero mean : 5969.148
##  ARIMA(3,0,0) with zero mean     : 6018.809
##  ARIMA(3,0,0) with non-zero mean : 5989.986
##  ARIMA(3,0,1) with zero mean     : 6020.718
##  ARIMA(3,0,1) with non-zero mean : 5989.823
##  ARIMA(3,0,2) with zero mean     : 6024.638
##  ARIMA(3,0,2) with non-zero mean : 5977.429
##  ARIMA(4,0,0) with zero mean     : 6019.886
##  ARIMA(4,0,0) with non-zero mean : 5984.677
##  ARIMA(4,0,1) with zero mean     : 6015.737
##  ARIMA(4,0,1) with non-zero mean : 5981.114
##  ARIMA(5,0,0) with zero mean     : 5996.869
##  ARIMA(5,0,0) with non-zero mean : 5974.636
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6510.13
##  ARIMA(0,0,0) with non-zero mean : 6331.928
##  ARIMA(0,0,1) with zero mean     : 6236.414
##  ARIMA(0,0,1) with non-zero mean : 6116.696
##  ARIMA(0,0,2) with zero mean     : 6152.059
##  ARIMA(0,0,2) with non-zero mean : 6063.454
##  ARIMA(0,0,3) with zero mean     : 6048.71
##  ARIMA(0,0,3) with non-zero mean : 5988.861
##  ARIMA(0,0,4) with zero mean     : 6027.292
##  ARIMA(0,0,4) with non-zero mean : 5982.374
##  ARIMA(0,0,5) with zero mean     : 6023.767
##  ARIMA(0,0,5) with non-zero mean : 5984.244
##  ARIMA(1,0,0) with zero mean     : 6034.656
##  ARIMA(1,0,0) with non-zero mean : 6016.774
##  ARIMA(1,0,1) with zero mean     : 6035.284
##  ARIMA(1,0,1) with non-zero mean : 6014.674
##  ARIMA(1,0,2) with zero mean     : 6032.32
##  ARIMA(1,0,2) with non-zero mean : 6008.712
##  ARIMA(1,0,3) with zero mean     : 6016.585
##  ARIMA(1,0,3) with non-zero mean : 5982.549
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5984.266
##  ARIMA(2,0,0) with zero mean     : 6034.988
##  ARIMA(2,0,0) with non-zero mean : 6013.033
##  ARIMA(2,0,1) with zero mean     : 6036.536
##  ARIMA(2,0,1) with non-zero mean : 6007.66
##  ARIMA(2,0,2) with zero mean     : 6031.1
##  ARIMA(2,0,2) with non-zero mean : 5996.766
##  ARIMA(2,0,3) with zero mean     : 6017.27
##  ARIMA(2,0,3) with non-zero mean : 5984.212
##  ARIMA(3,0,0) with zero mean     : 6033.859
##  ARIMA(3,0,0) with non-zero mean : 6005.092
##  ARIMA(3,0,1) with zero mean     : 6035.767
##  ARIMA(3,0,1) with non-zero mean : 6004.945
##  ARIMA(3,0,2) with zero mean     : 6039.674
##  ARIMA(3,0,2) with non-zero mean : 5992.479
##  ARIMA(4,0,0) with zero mean     : 6034.938
##  ARIMA(4,0,0) with non-zero mean : 5999.836
##  ARIMA(4,0,1) with zero mean     : 6030.776
##  ARIMA(4,0,1) with non-zero mean : 5996.248
##  ARIMA(5,0,0) with zero mean     : 6011.85
##  ARIMA(5,0,0) with non-zero mean : 5989.709
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6526.45
##  ARIMA(0,0,0) with non-zero mean : 6348.171
##  ARIMA(0,0,1) with zero mean     : 6251.991
##  ARIMA(0,0,1) with non-zero mean : 6132.262
##  ARIMA(0,0,2) with zero mean     : 6167.41
##  ARIMA(0,0,2) with non-zero mean : 6078.952
##  ARIMA(0,0,3) with zero mean     : 6063.771
##  ARIMA(0,0,3) with non-zero mean : 6003.996
##  ARIMA(0,0,4) with zero mean     : 6042.292
##  ARIMA(0,0,4) with non-zero mean : 5997.459
##  ARIMA(0,0,5) with zero mean     : 6038.756
##  ARIMA(0,0,5) with non-zero mean : 5999.328
##  ARIMA(1,0,0) with zero mean     : 6049.799
##  ARIMA(1,0,0) with non-zero mean : 6032.081
##  ARIMA(1,0,1) with zero mean     : 6050.408
##  ARIMA(1,0,1) with non-zero mean : 6029.95
##  ARIMA(1,0,2) with zero mean     : 6047.422
##  ARIMA(1,0,2) with non-zero mean : 6023.978
##  ARIMA(1,0,3) with zero mean     : 6031.571
##  ARIMA(1,0,3) with non-zero mean : 5997.635
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 5999.352
##  ARIMA(2,0,0) with zero mean     : 6050.107
##  ARIMA(2,0,0) with non-zero mean : 6028.299
##  ARIMA(2,0,1) with zero mean     : 6051.657
##  ARIMA(2,0,1) with non-zero mean : 6022.938
##  ARIMA(2,0,2) with zero mean     : 6046.168
##  ARIMA(2,0,2) with non-zero mean : 6011.962
##  ARIMA(2,0,3) with zero mean     : 6032.258
##  ARIMA(2,0,3) with non-zero mean : 5999.295
##  ARIMA(3,0,0) with zero mean     : 6048.962
##  ARIMA(3,0,0) with non-zero mean : 6020.35
##  ARIMA(3,0,1) with zero mean     : 6050.869
##  ARIMA(3,0,1) with non-zero mean : 6020.21
##  ARIMA(3,0,2) with zero mean     : 6054.801
##  ARIMA(3,0,2) with non-zero mean : 6007.615
##  ARIMA(4,0,0) with zero mean     : 6050.032
##  ARIMA(4,0,0) with non-zero mean : 6015.083
##  ARIMA(4,0,1) with zero mean     : 6045.838
##  ARIMA(4,0,1) with non-zero mean : 6011.443
##  ARIMA(5,0,0) with zero mean     : 6026.84
##  ARIMA(5,0,0) with non-zero mean : 6004.816
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6542.781
##  ARIMA(0,0,0) with non-zero mean : 6364.329
##  ARIMA(0,0,1) with zero mean     : 6267.595
##  ARIMA(0,0,1) with non-zero mean : 6147.668
##  ARIMA(0,0,2) with zero mean     : 6182.832
##  ARIMA(0,0,2) with non-zero mean : 6094.112
##  ARIMA(0,0,3) with zero mean     : 6078.877
##  ARIMA(0,0,3) with non-zero mean : 6018.926
##  ARIMA(0,0,4) with zero mean     : 6057.375
##  ARIMA(0,0,4) with non-zero mean : 6012.338
##  ARIMA(0,0,5) with zero mean     : 6053.829
##  ARIMA(0,0,5) with non-zero mean : 6014.205
##  ARIMA(1,0,0) with zero mean     : 6064.848
##  ARIMA(1,0,0) with non-zero mean : 6047.08
##  ARIMA(1,0,1) with zero mean     : 6065.46
##  ARIMA(1,0,1) with non-zero mean : 6044.934
##  ARIMA(1,0,2) with zero mean     : 6062.476
##  ARIMA(1,0,2) with non-zero mean : 6038.936
##  ARIMA(1,0,3) with zero mean     : 6046.62
##  ARIMA(1,0,3) with non-zero mean : 6012.513
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 6014.226
##  ARIMA(2,0,0) with zero mean     : 6065.161
##  ARIMA(2,0,0) with non-zero mean : 6043.277
##  ARIMA(2,0,1) with zero mean     : 6066.7
##  ARIMA(2,0,1) with non-zero mean : 6037.888
##  ARIMA(2,0,2) with zero mean     : 6061.234
##  ARIMA(2,0,2) with non-zero mean : 6026.877
##  ARIMA(2,0,3) with zero mean     : 6047.286
##  ARIMA(2,0,3) with non-zero mean : 6014.172
##  ARIMA(3,0,0) with zero mean     : 6064.021
##  ARIMA(3,0,0) with non-zero mean : 6035.296
##  ARIMA(3,0,1) with zero mean     : 6065.928
##  ARIMA(3,0,1) with non-zero mean : 6035.151
##  ARIMA(3,0,2) with zero mean     : 6069.845
##  ARIMA(3,0,2) with non-zero mean : 6022.51
##  ARIMA(4,0,0) with zero mean     : 6065.091
##  ARIMA(4,0,0) with non-zero mean : 6030.014
##  ARIMA(4,0,1) with zero mean     : 6060.875
##  ARIMA(4,0,1) with non-zero mean : 6026.361
##  ARIMA(5,0,0) with zero mean     : 6041.812
##  ARIMA(5,0,0) with non-zero mean : 6019.71
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6559.094
##  ARIMA(0,0,0) with non-zero mean : 6380.584
##  ARIMA(0,0,1) with zero mean     : 6283.162
##  ARIMA(0,0,1) with non-zero mean : 6163.289
##  ARIMA(0,0,2) with zero mean     : 6198.17
##  ARIMA(0,0,2) with non-zero mean : 6109.503
##  ARIMA(0,0,3) with zero mean     : 6093.933
##  ARIMA(0,0,3) with non-zero mean : 6033.971
##  ARIMA(0,0,4) with zero mean     : 6072.368
##  ARIMA(0,0,4) with non-zero mean : 6027.342
##  ARIMA(0,0,5) with zero mean     : 6068.806
##  ARIMA(0,0,5) with non-zero mean : 6029.196
##  ARIMA(1,0,0) with zero mean     : 6079.883
##  ARIMA(1,0,0) with non-zero mean : 6062.171
##  ARIMA(1,0,1) with zero mean     : 6080.491
##  ARIMA(1,0,1) with non-zero mean : 6060.036
##  ARIMA(1,0,2) with zero mean     : 6077.488
##  ARIMA(1,0,2) with non-zero mean : 6053.981
##  ARIMA(1,0,3) with zero mean     : 6061.583
##  ARIMA(1,0,3) with non-zero mean : 6027.496
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 6029.218
##  ARIMA(2,0,0) with zero mean     : 6080.191
##  ARIMA(2,0,0) with non-zero mean : 6058.382
##  ARIMA(2,0,1) with zero mean     : 6081.729
##  ARIMA(2,0,1) with non-zero mean : 6052.977
##  ARIMA(2,0,2) with zero mean     : 6076.237
##  ARIMA(2,0,2) with non-zero mean : 6041.887
##  ARIMA(2,0,3) with zero mean     : 6062.246
##  ARIMA(2,0,3) with non-zero mean : 6029.165
##  ARIMA(3,0,0) with zero mean     : 6079.038
##  ARIMA(3,0,0) with non-zero mean : 6050.357
##  ARIMA(3,0,1) with zero mean     : 6080.944
##  ARIMA(3,0,1) with non-zero mean : 6050.203
##  ARIMA(3,0,2) with zero mean     : 6085.093
##  ARIMA(3,0,2) with non-zero mean : 6037.523
##  ARIMA(4,0,0) with zero mean     : 6080.105
##  ARIMA(4,0,0) with non-zero mean : 6045.045
##  ARIMA(4,0,1) with zero mean     : 6075.871
##  ARIMA(4,0,1) with non-zero mean : 6041.364
##  ARIMA(5,0,0) with zero mean     : 6056.76
##  ARIMA(5,0,0) with non-zero mean : 6034.691
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6575.409
##  ARIMA(0,0,0) with non-zero mean : 6396.792
##  ARIMA(0,0,1) with zero mean     : 6298.758
##  ARIMA(0,0,1) with non-zero mean : 6178.712
##  ARIMA(0,0,2) with zero mean     : 6213.517
##  ARIMA(0,0,2) with non-zero mean : 6124.772
##  ARIMA(0,0,3) with zero mean     : 6108.991
##  ARIMA(0,0,3) with non-zero mean : 6048.978
##  ARIMA(0,0,4) with zero mean     : 6087.362
##  ARIMA(0,0,4) with non-zero mean : 6042.298
##  ARIMA(0,0,5) with zero mean     : 6083.782
##  ARIMA(0,0,5) with non-zero mean : 6044.148
##  ARIMA(1,0,0) with zero mean     : 6094.915
##  ARIMA(1,0,0) with non-zero mean : 6077.183
##  ARIMA(1,0,1) with zero mean     : 6095.522
##  ARIMA(1,0,1) with non-zero mean : 6075.039
##  ARIMA(1,0,2) with zero mean     : 6092.5
##  ARIMA(1,0,2) with non-zero mean : 6068.988
##  ARIMA(1,0,3) with zero mean     : 6076.547
##  ARIMA(1,0,3) with non-zero mean : 6042.444
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 6044.17
##  ARIMA(2,0,0) with zero mean     : 6095.221
##  ARIMA(2,0,0) with non-zero mean : 6073.385
##  ARIMA(2,0,1) with zero mean     : 6096.766
##  ARIMA(2,0,1) with non-zero mean : 6067.976
##  ARIMA(2,0,2) with zero mean     : 6091.241
##  ARIMA(2,0,2) with non-zero mean : 6056.901
##  ARIMA(2,0,3) with zero mean     : 6077.207
##  ARIMA(2,0,3) with non-zero mean : 6044.12
##  ARIMA(3,0,0) with zero mean     : 6094.059
##  ARIMA(3,0,0) with non-zero mean : 6065.363
##  ARIMA(3,0,1) with zero mean     : 6095.965
##  ARIMA(3,0,1) with non-zero mean : 6065.203
##  ARIMA(3,0,2) with zero mean     : 6099.879
##  ARIMA(3,0,2) with non-zero mean : 6052.519
##  ARIMA(4,0,0) with zero mean     : 6095.127
##  ARIMA(4,0,0) with non-zero mean : 6060.024
##  ARIMA(4,0,1) with zero mean     : 6090.876
##  ARIMA(4,0,1) with non-zero mean : 6056.334
##  ARIMA(5,0,0) with zero mean     : 6071.702
##  ARIMA(5,0,0) with non-zero mean : 6049.647
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6591.854
##  ARIMA(0,0,0) with non-zero mean : 6412.7
##  ARIMA(0,0,1) with zero mean     : 6314.502
##  ARIMA(0,0,1) with non-zero mean : 6193.966
##  ARIMA(0,0,2) with zero mean     : 6229.147
##  ARIMA(0,0,2) with non-zero mean : 6139.871
##  ARIMA(0,0,3) with zero mean     : 6124.307
##  ARIMA(0,0,3) with non-zero mean : 6063.881
##  ARIMA(0,0,4) with zero mean     : 6102.601
##  ARIMA(0,0,4) with non-zero mean : 6057.187
##  ARIMA(0,0,5) with zero mean     : 6098.999
##  ARIMA(0,0,5) with non-zero mean : 6059.039
##  ARIMA(1,0,0) with zero mean     : 6110.213
##  ARIMA(1,0,0) with non-zero mean : 6092.229
##  ARIMA(1,0,1) with zero mean     : 6110.815
##  ARIMA(1,0,1) with non-zero mean : 6090.062
##  ARIMA(1,0,2) with zero mean     : 6107.8
##  ARIMA(1,0,2) with non-zero mean : 6083.995
##  ARIMA(1,0,3) with zero mean     : 6091.745
##  ARIMA(1,0,3) with non-zero mean : 6057.339
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 6059.06
##  ARIMA(2,0,0) with zero mean     : 6110.515
##  ARIMA(2,0,0) with non-zero mean : 6088.399
##  ARIMA(2,0,1) with zero mean     : 6112.041
##  ARIMA(2,0,1) with non-zero mean : 6082.957
##  ARIMA(2,0,2) with zero mean     : 6106.526
##  ARIMA(2,0,2) with non-zero mean : 6071.829
##  ARIMA(2,0,3) with zero mean     : 6092.405
##  ARIMA(2,0,3) with non-zero mean : 6059.009
##  ARIMA(3,0,0) with zero mean     : 6109.361
##  ARIMA(3,0,0) with non-zero mean : 6080.34
##  ARIMA(3,0,1) with zero mean     : 6111.267
##  ARIMA(3,0,1) with non-zero mean : 6080.166
##  ARIMA(3,0,2) with zero mean     : 6115.199
##  ARIMA(3,0,2) with non-zero mean : 6067.443
##  ARIMA(4,0,0) with zero mean     : 6110.424
##  ARIMA(4,0,0) with non-zero mean : 6074.96
##  ARIMA(4,0,1) with zero mean     : 6106.144
##  ARIMA(4,0,1) with non-zero mean : 6071.252
##  ARIMA(5,0,0) with zero mean     : 6086.83
##  ARIMA(5,0,0) with non-zero mean : 6064.546
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
## [1] "input_series=data$sold_count"
## 
##  ARIMA(0,0,0) with zero mean     : 6608.296
##  ARIMA(0,0,0) with non-zero mean : 6428.606
##  ARIMA(0,0,1) with zero mean     : 6330.121
##  ARIMA(0,0,1) with non-zero mean : 6209.306
##  ARIMA(0,0,2) with zero mean     : 6244.493
##  ARIMA(0,0,2) with non-zero mean : 6155.067
##  ARIMA(0,0,3) with zero mean     : 6139.405
##  ARIMA(0,0,3) with non-zero mean : 6078.792
##  ARIMA(0,0,4) with zero mean     : 6117.614
##  ARIMA(0,0,4) with non-zero mean : 6072.074
##  ARIMA(0,0,5) with zero mean     : 6113.99
##  ARIMA(0,0,5) with non-zero mean : 6073.924
##  ARIMA(1,0,0) with zero mean     : 6125.247
##  ARIMA(1,0,0) with non-zero mean : 6107.211
##  ARIMA(1,0,1) with zero mean     : 6125.841
##  ARIMA(1,0,1) with non-zero mean : 6105.048
##  ARIMA(1,0,2) with zero mean     : 6122.815
##  ARIMA(1,0,2) with non-zero mean : 6098.955
##  ARIMA(1,0,3) with zero mean     : 6106.716
##  ARIMA(1,0,3) with non-zero mean : 6072.223
##  ARIMA(1,0,4) with zero mean     : Inf
##  ARIMA(1,0,4) with non-zero mean : 6073.946
##  ARIMA(2,0,0) with zero mean     : 6125.539
##  ARIMA(2,0,0) with non-zero mean : 6103.388
##  ARIMA(2,0,1) with zero mean     : 6127.066
##  ARIMA(2,0,1) with non-zero mean : 6097.953
##  ARIMA(2,0,2) with zero mean     : 6121.532
##  ARIMA(2,0,2) with non-zero mean : 6086.778
##  ARIMA(2,0,3) with zero mean     : 6107.374
##  ARIMA(2,0,3) with non-zero mean : 6073.896
##  ARIMA(3,0,0) with zero mean     : 6124.377
##  ARIMA(3,0,0) with non-zero mean : 6095.322
##  ARIMA(3,0,1) with zero mean     : 6126.283
##  ARIMA(3,0,1) with non-zero mean : 6095.148
##  ARIMA(3,0,2) with zero mean     : 6130.203
##  ARIMA(3,0,2) with non-zero mean : 6082.345
##  ARIMA(4,0,0) with zero mean     : 6125.44
##  ARIMA(4,0,0) with non-zero mean : 6089.931
##  ARIMA(4,0,1) with zero mean     : 6121.146
##  ARIMA(4,0,1) with non-zero mean : 6086.197
##  ARIMA(5,0,0) with zero mean     : 6101.778
##  ARIMA(5,0,0) with non-zero mean : 6079.455
## 
## 
## 
##  Best model: ARIMA(0,0,4) with non-zero mean 
## 
## [1] "input_series=ts(data$sold_count,freq=16)"
##             variable  n     mean       sd        CV      FBias      MAPE
## 1:    lm_prediction2 14 412.4286 232.3915 0.5634709 -4.6142064 5.1883257
## 2:    lm_prediction3 14 412.4286 232.3915 0.5634709 -4.7166790 5.3069999
## 3:    lm_prediction4 14 412.4286 232.3915 0.5634709 -4.5580423 5.0437342
## 4:    lm_prediction5 14 412.4286 232.3915 0.5634709  0.1458778 0.6688978
## 5:    lm_prediction6 14 412.4286 232.3915 0.5634709 -4.4771036 4.9528052
## 6:  arima_prediction 14 412.4286 232.3915 0.5634709 -0.3479633 0.7460511
## 7: sarima_prediction 14 412.4286 232.3915 0.5634709 -0.2817767 0.6716086
## 8:    selected_arima 14 412.4286 232.3915 0.5634709  0.1967600 0.6791326
##         RMSE       MAD      MADP     WMAPE
## 1: 2184.1016 1903.0305 4.6142064 4.6142064
## 2: 2238.7716 1945.2932 4.7166790 4.7166790
## 3: 2165.7375 1879.8669 4.5580423 4.5580423
## 4:  303.9193  245.5246 0.5953143 0.5953143
## 5: 2125.5963 1846.4854 4.4771036 4.4771036
## 6:  221.0945  188.7099 0.4575578 0.4575578
## 7:  203.6424  168.6188 0.4088437 0.4088437
## 8:  276.4342  230.1284 0.5579837 0.5579837

The smallest Weighted Mean Absolute Percentage Error (WMAPE) is obtained for the ARIMA(0,0,4) model with 16-day frequency decomposition, which is also the model that auto.arima suggested. This model is therefore selected for our prediction purposes.
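The report's code is in R, but the WMAPE criterion used for model selection above is simple to state. Below is a minimal sketch in Python with hypothetical numbers (not the report's data):

```python
def wmape(actual, predicted):
    """Weighted MAPE: total absolute error divided by total actual sales,
    so high-volume days weigh more than low-volume ones."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / sum(actual)

# hypothetical three-day example
error = wmape([400, 500, 300], [380, 520, 310])  # (20 + 20 + 10) / 1200
```

Unlike plain MAPE, WMAPE does not blow up on near-zero sales days, which is why it is a common choice for retail demand data.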

In conclusion, here is a plot of the actual test set and the predicted values of the chosen model. As can be seen, the predictions are not far off.

One Day Ahead Prediction with the Selected Model for Product 5

With the selected model, a one-day-ahead prediction can be performed using all the data on hand, since a one-day-ahead prediction must be submitted in this competition.
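The fitted model in the output below is an MA(4) with an intercept and a campaign regressor. As a simplified illustration in Python (ignoring the regressor and the trend/seasonal components that must be added back), a one-step-ahead forecast from an MA(q) fit combines the estimated mean with the most recent residuals; the residual values here are hypothetical:

```python
def ma_one_step(mu, thetas, recent_residuals):
    """One-step-ahead forecast of an MA(q) process:
    y_hat = mu + theta_1*e_t + ... + theta_q*e_{t-q+1}
    (recent_residuals is ordered most recent first)."""
    return mu + sum(t * e for t, e in zip(thetas, recent_residuals))

# coefficients echo the fitted MA(4) shown below; residuals are made up
forecast = ma_one_step(0.9158, [0.7895, 0.5020, 0.2626, -0.0056],
                       [0.2, -0.1, 0.05, 0.0])
```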

##    price event_date product_content_id sold_count visit_count favored_count
## 1: 44.86 2021-07-01           31515569        483       10787           649
##    basket_count category_sold category_brand_sold category_visits ty_visits
## 1:         2046          7463                1510          418946 106491398
##    category_basket category_favored w_day mon is_campaign
## 1:           38687            37332     5   7           0
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.4764 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.0238 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

## 
## Call:
## arima(x = detrend2, order = c(0, 0, 4), xreg = data_31515569$is_campaign, include.mean = TRUE)
## 
## Coefficients:
##          ma1     ma2     ma3      ma4  intercept  data_31515569$is_campaign
##       0.7895  0.5020  0.2626  -0.0056     0.9158                     0.6705
## s.e.  0.0538  0.0719  0.0723   0.0584     0.0542                     0.1015
## 
## sigma^2 estimated as 0.1698:  log likelihood = -206.42,  aic = 426.85
## [1] 426.8456
## [1] 454.5546
## Time Series:
## Start = c(26, 4) 
## End = c(26, 4) 
## Frequency = 16 
## [1] 249.9944
##    price event_date product_content_id sold_count visit_count favored_count
## 1: 44.86 2021-07-03           31515569        483       10787           649
##    basket_count category_sold category_brand_sold category_visits ty_visits
## 1:         2046          7463                1510          418946 106491398
##    category_basket category_favored w_day mon is_campaign arima1_prediction
## 1:           38687            37332     5   7           0          249.9944

PRODUCT 6 - TrendyolMilla Bikini Top

Before building forecasting models for Product 6, the plot of the data should be examined for seasonality and trend. Below, you can see the plot of the sales quantity of Product 6. Missing sold counts are filled with the mean of the data. There is a slightly increasing trend, especially at the beginning and end of the plot, but no significant seasonality is visible. To look further, there is a plot of three months of 2021 (March, April and May). Again, the seasonality is not significant. In conclusion, it can be said that there is no seasonality.

Linear Regression Model For Product 6

The first type of model to be used is linear regression. First of all, it is wise to select attributes that will help the model from the correlation matrix. Below, you can see the correlations between the attributes. According to this matrix, only basket_count can be added to the model.

In the first model, this attribute is added. The adjusted R-squared value indicates whether the model is good or not, and the value for the first model is fairly high, which is a good sign. However, there are outliers, probably due to campaigns and holidays, which can be eliminated for a better model. Lastly, the ‘lag1’ attribute can be added because it is very high in the ACF. In the final linear regression model, the adjusted R-squared value is high enough and the residual plots are good enough to make predictions.
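The feature engineering described above (outlier dummies plus a lag1 term) can be sketched language-neutrally. The Python below uses hypothetical thresholds, not the report's actual outlier definition:

```python
def build_features(sold, low, high):
    """Build the regression inputs: previous-day sales (lag1) and
    dummy columns flagging unusually big/small days."""
    lag1 = [None] + sold[:-1]                     # no lag for the first day
    big_outlier = [1 if s > high else 0 for s in sold]
    small_outlier = [1 if s < low else 0 for s in sold]
    return lag1, big_outlier, small_outlier

lag1, big, small = build_features([30, 81, 5, 33], low=10, high=60)
```

In the report's R model the same idea appears as the `big_outlier` and `small_outlier` regressors alongside `lag1`.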

## 
## Call:
## lm(formula = sold_count ~ basket_count, data = sold)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -22.044  -1.727   1.130   1.130  22.761 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  9.974707   0.936233   10.65   <2e-16 ***
## basket_count 0.126016   0.005341   23.60   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.745 on 367 degrees of freedom
## Multiple R-squared:  0.6027, Adjusted R-squared:  0.6016 
## F-statistic: 556.8 on 1 and 367 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 65.762, df = 10, p-value = 2.897e-10

##    sold_count   
##  Min.   : 1.00  
##  1st Qu.:32.00  
##  Median :32.86  
##  Mean   :30.45  
##  3rd Qu.:32.86  
##  Max.   :81.00
## 
## Call:
## lm(formula = sold_count ~ big_outlier + small_outlier + basket_count, 
##     data = sold)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.3260  -0.3618  -0.3618  -0.3618  18.0578 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    21.926503   0.755673   29.02   <2e-16 ***
## big_outlier     8.283919   0.779576   10.63   <2e-16 ***
## small_outlier -13.178828   0.582728  -22.62   <2e-16 ***
## basket_count    0.065431   0.004239   15.44   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.156 on 365 degrees of freedom
## Multiple R-squared:   0.85,  Adjusted R-squared:  0.8488 
## F-statistic: 689.6 on 3 and 365 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 21.904, df = 10, p-value = 0.0156
## 
## Call:
## lm(formula = sold_count ~ lag1 + big_outlier + small_outlier + 
##     basket_count, data = sold)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.9443  -0.3295  -0.3295  -0.3295  15.8885 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    22.054336   0.742197  29.715  < 2e-16 ***
## lag1            0.201297   0.051769   3.888  0.00012 ***
## big_outlier     8.315454   0.764965  10.870  < 2e-16 ***
## small_outlier -13.404209   0.574705 -23.324  < 2e-16 ***
## basket_count    0.064925   0.004161  15.601  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.078 on 364 degrees of freedom
## Multiple R-squared:  0.856,  Adjusted R-squared:  0.8544 
## F-statistic:   541 on 4 and 364 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 10
## 
## data:  Residuals
## LM test = 10.655, df = 10, p-value = 0.385

Arima Model For Product 6

The second type of model to be built is an ARIMA model. Before fitting it, the data should be decomposed. First, a frequency value must be chosen; since there is no significant seasonality, the highest value in the ACF, 9, is used. Additive decomposition is used for this task. Below, the random (remainder) series can be seen.
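As a rough sketch of the classical additive decomposition step (in Python; the report itself uses R's decomposition functions), the trend is a centered moving average with window equal to the chosen frequency, and the random series is what remains after subtracting trend and seasonal components:

```python
def moving_average_trend(y, freq):
    """Centered moving average of window `freq` (odd freq assumed),
    the trend estimate of a classical additive decomposition."""
    half = freq // 2
    trend = [None] * len(y)  # ends have no full window
    for i in range(half, len(y) - half):
        trend[i] = sum(y[i - half:i + half + 1]) / freq
    return trend

trend = moving_average_trend([1, 2, 3, 4, 5, 6, 7, 8, 9], freq=9)
```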

After the decomposition, (p,d,q) values should be chosen for the model by examining the ACF and PACF. Looking at the ACF, 3 can be chosen for ‘q’; looking at the PACF, 3 or 6 can be chosen for ‘p’. The auto.arima function is used as well. The AIC and BIC values of the candidate models can be seen below; by these criteria, the (6,0,3) model is the best among them. After the model is selected, the regressors that correlate most with the sold count are added to improve it. The final model has lower AIC and BIC values, so we can proceed with it.
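The AIC and BIC comparisons follow directly from the log-likelihood. A minimal Python sketch, using the (3,0,3) fit from the output below as a check (k counts the 7 estimated coefficients plus the innovation variance; n is the training length):

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2*logLik + 2*k."""
    return -2 * loglik + 2 * k

def bic(loglik, k, n):
    """Bayesian information criterion: -2*logLik + k*log(n)."""
    return -2 * loglik + k * math.log(n)

# ARIMA(3,0,3) fit below: 7 coefficients + sigma^2 -> k = 8
a = aic(-1141.11, 8)  # ~2298.22, matching the printed aic up to rounding
```

BIC penalizes extra parameters more heavily than AIC for any series longer than about 8 observations, which is why the two criteria can disagree on the best order.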

## 
## Call:
## arima(x = detrend, order = c(3, 0, 3))
## 
## Coefficients:
##          ar1      ar2      ar3      ma1     ma2     ma3  intercept
##       0.9007  -0.0338  -0.3890  -1.1415  -0.169  0.3105    -0.0022
## s.e.  0.5949   0.8402   0.4472   0.5991   0.988  0.3941     0.0030
## 
## sigma^2 estimated as 31.95:  log likelihood = -1141.11,  aic = 2298.23
## [1] 2298.226
## [1] 2329.337
## 
## Call:
## arima(x = detrend, order = c(6, 0, 3))
## 
## Coefficients:
##          ar1     ar2      ar3      ar4      ar5      ar6      ma1      ma2
##       0.3837  0.1171  -0.3832  -0.1001  -0.0044  -0.2076  -0.6075  -0.4494
## s.e.  0.2530  0.2258   0.1678   0.0954   0.0762   0.0602   0.2598   0.2562
##          ma3  intercept
##       0.0571    -0.0023
## s.e.  0.2283     0.0032
## 
## sigma^2 estimated as 31.23:  log likelihood = -1137.01,  aic = 2296.01
## [1] 2296.013
## [1] 2338.791
## Series: detrend 
## ARIMA(0,0,1) with non-zero mean 
## 
## Coefficients:
##          ma1     mean
##       0.2199  -0.0189
## s.e.  0.0486   0.4640
## 
## sigma^2 estimated as 52.58:  log likelihood=-1226.45
## AIC=2458.9   AICc=2458.97   BIC=2470.57
## [1] 2458.899
## [1] 2470.565
## 
## Call:
## arima(x = detrend, order = c(6, 0, 3), xreg = xreg)
## 
## Coefficients:
##          ar1     ar2      ar3     ar4     ar5      ar6      ma1      ma2
##       0.6351  0.2570  -0.6182  0.0153  0.0723  -0.1588  -0.8719  -0.5285
## s.e.  0.2349  0.2836   0.2008  0.0886  0.0820   0.0811   0.2360   0.3094
##          ma3  intercept    xreg
##       0.4429    -0.3081  0.0018
## s.e.  0.2606     0.1347  0.0008
## 
## sigma^2 estimated as 30.8:  log likelihood = -1132.94,  aic = 2289.89
## [1] 2289.885
## [1] 2336.552

Comparison Of Models

We selected two models for prediction; their accuracy values can be seen here. According to the box plot, the weighted mean absolute errors of the ARIMA model are higher. We choose the linear model, because its lower WMAPE value is a sign of a better model.

##          variable  n     mean       sd       CV      FBias      MAPE     RMSE
## 1:  lm_prediction 14 50.71429 11.75015 0.231693 0.07380659 0.1490211 10.36884
## 2: selected_arima 14 50.71429 11.75015 0.231693 0.05651922 0.2509979 15.59195
##          MAD      MADP     WMAPE
## 1:  8.003332 0.1578122 0.1578122
## 2: 12.670955 0.2498498 0.2498498

In conclusion, here is a plot of the actual test set and the predicted values of the chosen model. As can be seen, the predictions are fairly accurate.

PRODUCT 7 - Oral-B Rechargeable Toothbrush

First of all, the general behaviour of the data over time is examined with a time plot.

Secondly, the distribution across days and months is plotted to see whether sales change depending on the month and the day of the week.

Finally, the relationship with previous observations is examined through the ACF and PACF graphs.

It can be said that there is a trend in the data and, if the trend factor is excluded, the autocorrelation at lag1, lag3 and lag7 is significant.

Boxplots show that the data depends on the month and day factors. Since the day factor is significant, it will be used in model construction instead of lag7, and the frequency of the data is set to 7.

Examination of Attributes

Some of the attributes of the data are not reliable; therefore, they are examined through the summary of the data.

##      price         event_date         product_content_id   sold_count    
##  Min.   :110.1   Min.   :2020-05-25   Length:404         Min.   :  0.00  
##  1st Qu.:129.9   1st Qu.:2020-09-02   Class :character   1st Qu.: 20.00  
##  Median :136.2   Median :2020-12-12   Mode  :character   Median : 57.00  
##  Mean   :135.3   Mean   :2020-12-12                      Mean   : 94.93  
##  3rd Qu.:141.6   3rd Qu.:2021-03-23                      3rd Qu.:139.50  
##  Max.   :165.9   Max.   :2021-07-02                      Max.   :513.00  
##  NA's   :9                                                               
##   visit_count    favored_count     basket_count    category_sold   
##  Min.   :    0   Min.   :   0.0   Min.   :   0.0   Min.   : 321.0  
##  1st Qu.:    0   1st Qu.:   0.0   1st Qu.:  92.0   1st Qu.: 609.2  
##  Median :    0   Median : 171.5   Median : 239.5   Median : 804.5  
##  Mean   : 2270   Mean   : 357.6   Mean   : 399.7   Mean   :1008.6  
##  3rd Qu.: 4320   3rd Qu.: 593.8   3rd Qu.: 578.0   3rd Qu.:1101.0  
##  Max.   :15725   Max.   :2696.0   Max.   :2249.0   Max.   :5557.0  
##                                                                    
##  category_brand_sold category_visits     ty_visits         category_basket 
##  Min.   :    0.0     Min.   :  346.0   Min.   :        1   Min.   :     0  
##  1st Qu.:    0.0     1st Qu.:  656.5   1st Qu.:        1   1st Qu.:     0  
##  Median :  680.5     Median :  879.0   Median :        1   Median :     0  
##  Mean   : 2996.9     Mean   : 3845.8   Mean   : 44617481   Mean   : 18632  
##  3rd Qu.: 5355.5     3rd Qu.: 1343.8   3rd Qu.:102350467   3rd Qu.: 41373  
##  Max.   :28944.0     Max.   :59310.0   Max.   :178545693   Max.   :281022  
##                                                                            
##  category_favored     w_day        mon          is_campaign     
##  Min.   : 1242    Min.   :1   Min.   : 1.000   Min.   :0.00000  
##  1st Qu.: 2476    1st Qu.:2   1st Qu.: 4.000   1st Qu.:0.00000  
##  Median : 3286    Median :4   Median : 6.000   Median :0.00000  
##  Mean   : 4208    Mean   :4   Mean   : 6.463   Mean   :0.08663  
##  3rd Qu.: 4886    3rd Qu.:6   3rd Qu.: 9.000   3rd Qu.:0.00000  
##  Max.   :44445    Max.   :7   Max.   :12.000   Max.   :1.00000  
## 
##         price sold_count visit_count favored_count basket_count category_sold
## [1,] 112.9000          0         0.0           0.0          0.0         321.0
## [2,] 129.9000         20         0.0           0.0         92.0         608.5
## [3,] 136.2475         57         0.0         171.5        239.5         804.5
## [4,] 141.6109        140      4374.5         594.5        578.0        1103.0
## [5,] 158.1300        315     10777.0        1465.0       1287.0        1799.0
##      category_brand_sold category_visits ty_visits category_basket
## [1,]                 0.0           346.0         1             0.0
## [2,]                 0.0           656.0         1             0.0
## [3,]               680.5           879.0         1             0.0
## [4,]              5357.0          1345.5 102370187         41481.5
## [5,]             12868.0          2348.0 178545693        103254.0
##      category_favored w_day
## [1,]           1242.0     1
## [2,]           2475.5     2
## [3,]           3286.5     4
## [4,]           4887.0     6
## [5,]           8278.0     7

The relationship between the attributes and the response variable is examined with a correlation graph.

Basket_count, category_visits and category_favored have high correlations with the response, and the summary of the data suggests they are reliable. However, they contain zero values that are not expected in real life, so those zeros are replaced with the column mean.

ty_visits is also recorded as 1 before a certain date; those placeholder values are replaced with the mean of ty_visits.

Some price values are NA; they are replaced with the mean price, since price does not change significantly over time.
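A minimal sketch of this cleaning step, assuming the product data sits in a data.table called `dt` (the object name and the toy values are illustrative, not the actual data):

```r
library(data.table)

# Toy data with the problems described above: zero counts,
# placeholder ty_visits of 1, and NA prices.
dt <- data.table(price        = c(112.9, NA, 136.2, 141.6),
                 ty_visits    = c(1, 1, 102350467, 178545693),
                 basket_count = c(0, 92, 239, 578))

# Replace zeros / placeholder 1s with the mean of the valid observations.
dt[basket_count == 0, basket_count := mean(dt$basket_count[dt$basket_count > 0])]
dt[ty_visits == 1,    ty_visits    := mean(dt$ty_visits[dt$ty_visits > 1])]

# Replace NA prices with the mean price, since price is fairly stable.
dt[is.na(price), price := mean(dt$price, na.rm = TRUE)]
```

Note that `mean(dt$price, ...)` is taken over the full column rather than inside the filtered subset, where it would evaluate over the NA rows only.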

In the end, “price”, “visit_count”, “basket_count”, “category_favored”, “ty_visits” and “is_campaign” are chosen as regressors.

The predictions are based on the previous observations of the attributes, since the actual attribute values are not available at prediction time.

Model Construction

The data does not have constant variance; therefore, besides the plain linear model, sqrt and Box-Cox transformations of the response are also tried for the regression model.
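The three variants can be sketched as follows, simplified to a single regressor (the toy data and the single-regressor formula are assumptions; the column names `sqrt` and `BoxCox` follow this report):

```r
library(forecast)  # for BoxCox() and BoxCox.lambda()

set.seed(1)
train7 <- data.frame(basket_count = rnorm(100, 500, 50))
train7$sold_count <- pmax(1, round(0.2 * train7$basket_count + rnorm(100, 0, 10)))

# 1) plain linear model on the raw response
fit_raw <- lm(sold_count ~ basket_count, data = train7)

# 2) square-root transformed response
train7$sqrt <- sqrt(train7$sold_count)
fit_sqrt <- lm(sqrt ~ basket_count, data = train7)

# 3) Box-Cox transformed response (lambda estimated from the data)
lambda <- BoxCox.lambda(train7$sold_count)
train7$BoxCox <- BoxCox(train7$sold_count, lambda)
fit_bc <- lm(BoxCox ~ basket_count, data = train7)
```

Each fitted model is then compared on adjusted R-squared and residual diagnostics, as done below.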

Simple Linear Regression with No Transformation

After many iterations, it is seen that the day-of-week factor is not significant, as expected.

## 
## Call:
## lm(formula = sold_count ~ price + visit_count + basket_count + 
##     category_basket + factor(mon) + factor(is_campaign) + trend + 
##     lag1 + lag3, data = train7)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -120.315   -9.293   -0.319    7.725  121.574 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           6.476e+01  3.327e+01   1.947 0.052322 .  
## price                -7.719e-01  2.386e-01  -3.236 0.001322 ** 
## visit_count          -1.034e-02  1.641e-03  -6.299 8.55e-10 ***
## basket_count          2.255e-01  1.013e-02  22.268  < 2e-16 ***
## category_basket       2.710e-04  8.130e-05   3.334 0.000944 ***
## factor(mon)2         -8.302e+00  8.171e+00  -1.016 0.310287    
## factor(mon)3         -1.868e+01  7.361e+00  -2.537 0.011594 *  
## factor(mon)4         -1.234e+01  8.376e+00  -1.474 0.141467    
## factor(mon)5          2.579e+01  8.131e+00   3.171 0.001644 ** 
## factor(mon)6          2.349e+01  6.665e+00   3.524 0.000479 ***
## factor(mon)7          2.550e+01  7.484e+00   3.408 0.000727 ***
## factor(mon)8          1.680e+01  7.168e+00   2.344 0.019617 *  
## factor(mon)9         -1.049e+00  7.681e+00  -0.137 0.891425    
## factor(mon)10        -4.350e-01  6.756e+00  -0.064 0.948696    
## factor(mon)11         3.925e+00  6.535e+00   0.601 0.548474    
## factor(mon)12        -3.041e+00  5.781e+00  -0.526 0.599142    
## factor(is_campaign)1  1.590e+00  4.738e+00   0.335 0.737444    
## trend                 1.819e-01  2.528e-02   7.195 3.52e-12 ***
## lag1                  1.734e-01  2.752e-02   6.302 8.38e-10 ***
## lag3                  4.396e-02  2.184e-02   2.013 0.044846 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.41 on 369 degrees of freedom
## Multiple R-squared:  0.9505, Adjusted R-squared:  0.9479 
## F-statistic: 372.9 on 19 and 369 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 23
## 
## data:  Residuals
## LM test = 59.442, df = 23, p-value = 4.599e-05

The residual analysis of the lm model looks good, with residuals around mean zero and no strong autocorrelation; however, the error variability is higher at larger fitted values.

Simple Linear Regression with sqrt() Transformation

After many iterations, the regressors kept in the model summarized below are selected.

## 
## Call:
## lm(formula = sqrt ~ price + visit_count + basket_count + ty_visits + 
##     factor(mon) + lag1 + factor(is_campaign) + category_visits + 
##     category_basket, data = train7)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.0284 -0.6227  0.0620  0.6736  3.7829 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           1.415e+01  1.818e+00   7.785 7.09e-14 ***
## price                -7.308e-02  1.230e-02  -5.942 6.51e-09 ***
## visit_count          -9.948e-04  9.367e-05 -10.620  < 2e-16 ***
## basket_count          1.137e-02  5.484e-04  20.728  < 2e-16 ***
## ty_visits             4.261e-08  4.811e-09   8.856  < 2e-16 ***
## factor(mon)2         -3.184e-01  4.806e-01  -0.662 0.508094    
## factor(mon)3         -4.883e-01  4.251e-01  -1.148 0.251532    
## factor(mon)4         -5.228e-01  4.786e-01  -1.092 0.275354    
## factor(mon)5         -7.008e-01  4.598e-01  -1.524 0.128294    
## factor(mon)6         -1.601e+00  3.481e-01  -4.600 5.81e-06 ***
## factor(mon)7         -1.903e+00  3.514e-01  -5.415 1.10e-07 ***
## factor(mon)8         -1.660e+00  3.690e-01  -4.498 9.22e-06 ***
## factor(mon)9         -1.587e+00  4.211e-01  -3.769 0.000191 ***
## factor(mon)10        -1.692e+00  3.656e-01  -4.627 5.14e-06 ***
## factor(mon)11        -1.183e+00  3.550e-01  -3.333 0.000947 ***
## factor(mon)12        -3.677e-01  3.134e-01  -1.173 0.241425    
## lag1                  1.216e-02  1.313e-03   9.265  < 2e-16 ***
## factor(is_campaign)1 -2.645e-01  2.579e-01  -1.026 0.305615    
## category_visits       4.094e-05  1.236e-05   3.312 0.001018 ** 
## category_basket      -9.950e-06  4.780e-06  -2.082 0.038066 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.217 on 369 degrees of freedom
## Multiple R-squared:  0.9401, Adjusted R-squared:  0.937 
## F-statistic: 304.7 on 19 and 369 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 23
## 
## data:  Residuals
## LM test = 94.014, df = 23, p-value = 1.492e-10

The residual analysis shows significant autocorrelation at lag 1, although the residuals are around mean zero; the error variability is again higher at larger values. This model is poorer than the lm model with no transformation.

Simple Linear Regression with BoxCox Transformation

After many iterations, the day-of-week factor is again not significant. For the Box-Cox model, lag7 is significant while lag3 is not, so lag3 is excluded and lag7 included; category_basket is also significant under this transformation.

## 
## Call:
## lm(formula = BoxCox ~ price + visit_count + basket_count + category_favored + 
##     ty_visits + factor(mon) + lag1 + lag7 + factor(is_campaign) + 
##     category_basket, data = train7)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.9652 -0.4527  0.1688  0.6840  2.5776 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           1.395e+01  2.073e+00   6.729 6.57e-11 ***
## price                -7.492e-02  1.411e-02  -5.309 1.91e-07 ***
## visit_count          -6.764e-04  1.138e-04  -5.943 6.49e-09 ***
## basket_count          6.016e-03  6.815e-04   8.828  < 2e-16 ***
## category_favored      1.495e-04  3.967e-05   3.768 0.000191 ***
## ty_visits             4.365e-08  4.450e-09   9.808  < 2e-16 ***
## factor(mon)2          2.850e-01  5.531e-01   0.515 0.606758    
## factor(mon)3         -2.757e-02  4.771e-01  -0.058 0.953947    
## factor(mon)4         -7.028e-01  5.234e-01  -1.343 0.180211    
## factor(mon)5         -1.665e+00  5.160e-01  -3.227 0.001364 ** 
## factor(mon)6         -2.033e+00  3.871e-01  -5.253 2.54e-07 ***
## factor(mon)7         -2.211e+00  4.032e-01  -5.484 7.75e-08 ***
## factor(mon)8         -1.558e+00  4.208e-01  -3.702 0.000246 ***
## factor(mon)9         -1.550e+00  4.797e-01  -3.230 0.001348 ** 
## factor(mon)10        -1.507e+00  4.150e-01  -3.631 0.000322 ***
## factor(mon)11        -1.351e+00  4.104e-01  -3.291 0.001095 ** 
## factor(mon)12        -3.002e-01  3.570e-01  -0.841 0.400932    
## lag1                  7.618e-03  1.497e-03   5.088 5.78e-07 ***
## lag7                  3.144e-03  1.180e-03   2.665 0.008048 ** 
## factor(is_campaign)1 -4.970e-01  3.062e-01  -1.623 0.105370    
## category_basket      -3.579e-05  6.827e-06  -5.243 2.68e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.382 on 368 degrees of freedom
## Multiple R-squared:  0.8266, Adjusted R-squared:  0.8172 
## F-statistic: 87.74 on 20 and 368 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 24
## 
## data:  Residuals
## LM test = 178.01, df = 24, p-value < 2.2e-16

According to the residual analysis, the Box-Cox model shows large deviations over time, and its adjusted R-squared value is lower than the other models’.

Arima Models

When constructing the ARIMA models, the auto.arima function is used and re-run every day. Seasonality is set to TRUE, and the frequency is determined as seven by observing the ACF and PACF graphs.

Additive decomposition, multiplicative decomposition, and a linear regression model are used to obtain stationary data.

## [1] "The Additive Model"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0074
## [1] "The Multiplicative Model"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.2026
## [1] "Linear Regression"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.024
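The test statistics above come from a check along these lines (sketch with simulated weekly-seasonal data; `urca::ur.kpss` is one common way to run the KPSS test in R):

```r
library(urca)

# Toy daily series with weekly seasonality, frequency = 7
set.seed(2)
sales_ts <- ts(50 + rep(c(5, 3, 0, -2, -4, 1, 6), 20) + rnorm(140, 0, 3),
               frequency = 7)

# Additive decomposition; the remainder is what the ARIMA models will use
decomposed <- decompose(sales_ts, type = "additive")

# KPSS test on the remainder: a small statistic means the null of
# stationarity is not rejected.
kpss <- ur.kpss(na.omit(decomposed$random))
summary(kpss)
```

The decomposition whose remainder gives the smallest KPSS statistic is the safest choice for the subsequent ARIMA fits.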

The multiplicative model’s remainder is the furthest from stationarity (its KPSS statistic is the largest); therefore, the additive decomposition is used for the ARIMA and ARIMA-with-regressors models.

The residuals of the linear regression model are stationary, so an ARIMA model is fitted to these residuals and the two models are combined at the end.
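The combination works roughly as follows (a sketch on simulated data: auto.arima is fitted on the regression residuals, and its forecast is added back onto the regression forecast):

```r
library(forecast)

set.seed(7)
x <- rnorm(200)
sold <- as.numeric(50 + 5 * x + arima.sim(list(ma = 0.4), n = 200))
train <- data.frame(x = x, sold = sold)

# Step 1: linear regression captures the attribute effects
fit_lm <- lm(sold ~ x, data = train)

# Step 2: ARIMA captures the autocorrelation left in the residuals
fit_resid <- auto.arima(residuals(fit_lm))

# Combined one-step-ahead forecast: regression part + ARIMA correction
newx <- data.frame(x = 0.3)
pred <- as.numeric(predict(fit_lm, newdata = newx)) +
        as.numeric(forecast(fit_resid, h = 1)$mean[1])
```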

The regressors mentioned above are used for the ARIMA model with regressors.
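A sketch of the regressor version, passing the external inputs to `auto.arima` through its `xreg` argument (simulated data; the regressor name is an assumption):

```r
library(forecast)

set.seed(3)
basket_count <- rnorm(140, 100, 20)

# Remainder-like series that partly depends on the regressor
rand <- ts(0.5 * basket_count + rnorm(140, 0, 5), frequency = 7)

# ARIMA errors around a regression on basket_count
fit_xreg <- auto.arima(rand,
                       xreg = matrix(basket_count,
                                     dimnames = list(NULL, "xreg")),
                       seasonal = TRUE)
```

Forecasting then requires supplying future `xreg` values, which is why the last available attribute values are carried forward in this report.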

## Series: decomposed$random 
## ARIMA(0,0,1)(0,0,2)[7] with non-zero mean 
## 
## Coefficients:
##          ma1    sma1     sma2    mean
##       0.3223  0.0894  -0.0959  0.0053
## s.e.  0.0479  0.0523   0.0518  2.2771
## 
## sigma^2 estimated as 1160:  log likelihood=-1892.85
## AIC=3795.7   AICc=3795.86   BIC=3815.44

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,1)(0,0,2)[7] with non-zero mean
## Q* = 54.534, df = 10, p-value = 3.858e-08
## 
## Model df: 4.   Total lags used: 14

By observation, the PACF is significant at lag 1 and the ACF drops after lag 1, so it is reasonable that auto.arima gives an MA(1). The seasonal ACF and PACF are significant up to lag 2, so the seasonal order (0,0,2) is reasonable as well.

## Series: decomposed$random 
## Regression with ARIMA(5,1,1) errors 
## 
## Coefficients:
##          ar1      ar2      ar3      ar4      ar5      ma1     xreg
##       0.1616  -0.3492  -0.3004  -0.0625  -0.2538  -0.9816  -0.5042
## s.e.  0.0503   0.0506   0.0513   0.0505   0.0504   0.0143   0.2151
## 
## sigma^2 estimated as 932.9:  log likelihood=-1847.27
## AIC=3710.55   AICc=3710.94   BIC=3742.11

## 
##  Ljung-Box test
## 
## data:  Residuals from Regression with ARIMA(5,1,1) errors
## Q* = 19.474, df = 7, p-value = 0.006825
## 
## Model df: 7.   Total lags used: 14
## [1] 3710.549

By residual analysis, the ARIMA with regressors has less autocorrelated residuals and a lower AIC; therefore, it is a better model than the plain ARIMA.

Arima combined with linear Regression

## Series: residuals 
## ARIMA(0,0,3) with zero mean 
## 
## Coefficients:
##          ma1     ma2     ma3
##       0.1704  0.1537  0.0900
## s.e.  0.0503  0.0508  0.0536
## 
## sigma^2 estimated as 453.5:  log likelihood=-1740.24
## AIC=3488.48   AICc=3488.59   BIC=3504.34

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,3) with zero mean
## Q* = 3.569, df = 7, p-value = 0.8279
## 
## Model df: 3.   Total lags used: 10

The auto.arima model fitted on the regression residuals has zero mean, no autocorrelated residuals, and the lowest AIC value; by residual analysis it is better than both the ARIMA and the ARIMA-with-regressors models.

Predictions

The predictions are based on the last available attribute values, and they are plotted together with the actual sales values.

##     event_date actual sqrt_forecasted_sold BoxCox_forecasted_sold
##  1: 2021-06-18    108             75.30678               38.61865
##  2: 2021-06-19    104             85.18752               51.05261
##  3: 2021-06-20    149            142.34503              109.50590
##  4: 2021-06-21    128            116.80589              116.12060
##  5: 2021-06-22     56             97.96617               86.86636
##  6: 2021-06-23     59             65.41016               57.38209
##  7: 2021-06-24     56             63.05782               54.50673
##  8: 2021-06-25     36             55.53486               45.66327
##  9: 2021-06-26     40             52.72771               41.84033
## 10: 2021-06-27     46             72.90609               72.81313
## 11: 2021-06-28     64             73.59562               67.60311
## 12: 2021-06-29    137            120.37701              114.68402
## 13: 2021-06-30    131            133.14290              129.73114
## 14: 2021-07-01    130            106.68231               90.71156
##     lm_forecasted_sold forecasted_lm7_arima add_arima_forecasted
##  1:          130.99117            129.75684            159.88214
##  2:          122.10168            116.96945            151.61943
##  3:          156.84938            152.07316            145.17374
##  4:          130.65973            126.83063            158.80660
##  5:          108.49405            107.28952            154.81502
##  6:           82.48194             74.54178            134.35688
##  7:           77.32646             67.77471            120.78775
##  8:           69.69601             61.49438             97.99568
##  9:           70.19335             63.37959             74.91833
## 10:           67.03158             58.76225             57.04774
## 11:           79.05786             71.53833             52.04136
## 12:          131.06642            126.11501             55.21875
## 13:          140.91006            140.62089             74.90336
## 14:          139.55967            139.03486             87.46261
##     reg_add_arima_forecasted
##  1:                143.40567
##  2:                132.43593
##  3:                149.56207
##  4:                176.87823
##  5:                132.44178
##  6:                140.35844
##  7:                109.98056
##  8:                 88.86555
##  9:                 79.87113
## 10:                 67.76062
## 11:                 56.92694
## 12:                 51.75578
## 13:                 74.92831
## 14:                 84.87210

Error Rates of the Models

##                       model  n     mean       sd        CV       FBias
## 1:     sqrt_forecasted_sold 14 88.85714 41.34072 0.4652492 -0.01370246
## 2:   BoxCox_forecasted_sold 14 88.85714 41.34072 0.4652492  0.13416438
## 3:       lm_forecasted_sold 14 88.85714 41.34072 0.4652492 -0.21094803
## 4:     forecasted_lm7_arima 14 88.85714 41.34072 0.4652492 -0.15448667
## 5:     add_arima_forecasted 14 88.85714 41.34072 0.4652492 -0.22590787
## 6: reg_add_arima_forecasted 14 88.85714 41.34072 0.4652492 -0.19778385
##         MAPE     RMSE      MAD      MADP     WMAPE
## 1: 0.2508952 20.05112 16.83120 0.1894186 0.1894186
## 2: 0.2530779 30.64493 22.31949 0.2511840 0.2511840
## 3: 0.3394596 23.34176 19.59189 0.2204876 0.2204876
## 4: 0.2611267 19.58886 15.44930 0.1738667 0.1738667
## 5: 0.6984139 55.12053 48.10213 0.5413423 0.5413423
## 6: 0.6529313 51.53582 45.21977 0.5089042 0.5089042

Since the ARIMA model combined with linear regression has the lowest WMAPE value, it is selected for prediction. Moreover, every day the error rates over the last 14 days are recalculated, and the model with the lowest WMAPE is selected for that day’s prediction.
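The selection metric can be sketched as follows (hypothetical short vectors; WMAPE weights the absolute errors by the actual sales):

```r
# WMAPE = sum(|actual - forecast|) / sum(actual)
wmape <- function(actual, forecast) sum(abs(actual - forecast)) / sum(actual)

actual <- c(108, 104, 149, 128)
preds  <- list(lm_arima  = c(130, 117, 152, 127),
               add_arima = c(160, 152, 145, 159))

# Error of each candidate model over the evaluation window
errors <- sapply(preds, wmape, actual = actual)

# Model with the smallest WMAPE is used for the next day's prediction
best <- names(which.min(errors))
```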

Predictions of Next Day

##         add_arima    xreg_add_arima       forecast_lm forecast_lm_arima 
##          87.70957          83.45005         141.31027         143.38729 
##         BoxCox_lm           Sqrt_lm 
##          93.26515         108.00000

Product 8

Altinyildiz Classics Jacket

It can be seen that sales are zero most of the time; however, there is a huge increase in October.

The ACF and PACF of the data show significant autocorrelation at lag 1 and lag 7.

Examination of Attributes

The correlations of price, visit_count, and basket_count with the response are high, and it is expected that these variables are zero when sold_count is zero.

However, category_favored and ty_visits are not expected to be zero or one; those placeholder values are replaced with the mean.

##      price         event_date         product_content_id   sold_count     
##  Min.   : -1.0   Min.   :2020-05-25   Length:404         Min.   : 0.0000  
##  1st Qu.:350.0   1st Qu.:2020-09-02   Class :character   1st Qu.: 0.0000  
##  Median :600.0   Median :2020-12-12   Mode  :character   Median : 0.0000  
##  Mean   :557.9   Mean   :2020-12-12                      Mean   : 0.9233  
##  3rd Qu.:736.6   3rd Qu.:2021-03-23                      3rd Qu.: 0.0000  
##  Max.   :833.3   Max.   :2021-07-02                      Max.   :52.0000  
##  NA's   :303                                                              
##   visit_count     favored_count     basket_count     category_sold   
##  Min.   :  0.00   Min.   : 0.000   Min.   :  0.000   Min.   :   0.0  
##  1st Qu.:  0.00   1st Qu.: 0.000   1st Qu.:  0.000   1st Qu.:  16.0  
##  Median :  0.00   Median : 0.000   Median :  0.000   Median :  45.0  
##  Mean   : 27.07   Mean   : 2.238   Mean   :  5.819   Mean   : 198.6  
##  3rd Qu.:  3.00   3rd Qu.: 2.000   3rd Qu.:  5.000   3rd Qu.: 108.8  
##  Max.   :516.00   Max.   :37.000   Max.   :247.000   Max.   :3299.0  
##                                                                      
##  category_brand_sold category_visits    ty_visits         category_basket  
##  Min.   :     0      Min.   :   367   Min.   :        1   Min.   :      0  
##  1st Qu.:     0      1st Qu.:  1424   1st Qu.:        1   1st Qu.:      0  
##  Median :     6      Median :  5305   Median :        1   Median :      0  
##  Mean   : 46361      Mean   : 27422   Mean   : 44617481   Mean   : 353883  
##  3rd Qu.: 94565      3rd Qu.:  9521   3rd Qu.:102350467   3rd Qu.: 469103  
##  Max.   :259590      Max.   :583672   Max.   :178545693   Max.   :3102147  
##                                                                            
##  category_favored     w_day        mon          is_campaign     
##  Min.   :  2324   Min.   :1   Min.   : 1.000   Min.   :0.00000  
##  1st Qu.:  8562   1st Qu.:2   1st Qu.: 4.000   1st Qu.:0.00000  
##  Median : 24608   Median :4   Median : 6.000   Median :0.00000  
##  Mean   : 33744   Mean   :4   Mean   : 6.463   Mean   :0.08663  
##  3rd Qu.: 50363   3rd Qu.:6   3rd Qu.: 9.000   3rd Qu.:0.00000  
##  Max.   :244883   Max.   :7   Max.   :12.000   Max.   :1.00000  
## 
##       price sold_count visit_count favored_count basket_count category_sold
## [1,]  -1.00          0           0             0            0           0.0
## [2,] 349.99          0           0             0            0          16.0
## [3,] 599.98          0           0             0            0          45.0
## [4,] 736.64          0           3             2            5         109.5
## [5,] 833.32          0           7             5           12         248.0
##      category_brand_sold category_visits ty_visits category_basket
## [1,]                   0           367.0         1               0
## [2,]                   0          1417.0         1               0
## [3,]                   6          5305.0         1               0
## [4,]               94567          9526.5 102370187          473826
## [5,]              235840         21187.0 178545693         1177469
##      category_favored w_day
## [1,]           2324.0     1
## [2,]           8506.5     2
## [3,]          24608.0     4
## [4,]          50385.0     6
## [5,]         111346.0     7

Considering the correlations and variable reliability, “price”, “visit_count”, “basket_count” and “category_favored” are selected as regressors.

The ACF and PACF graphs show high correlation at lags 1, 2, 5 and 7; therefore, these lags are added as attributes.

Since the jacket is an expensive product, consumers are expected to consider its previous prices; therefore, lagged prices of the jacket are examined as well.
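The lagged attributes can be built with `data.table::shift` (column names as in this report; the values are toy data):

```r
library(data.table)

train8 <- data.table(price      = c(350, 600, 736, 833, 740, 820),
                     sold_count = c(0, 1, 0, 2, 1, 0))

train8[, lag1 := shift(sold_count, 1)]    # yesterday's sales
train8[, lag2 := shift(sold_count, 2)]
train8[, price_lag_4 := shift(price, 4)]  # price four days earlier
```

The first rows of each lag column are NA and drop out of the training set.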

The predictions are based on the previous observations of the attributes, since the actual attribute values are not available at prediction time.

Model Construction

The data does not have constant variance; therefore, besides the plain linear model, sqrt and Box-Cox transformations of the response are also tried for the regression model.

Simple Regression

After many iterations, it is seen that the most significant variables are price, visit_count, basket_count, category_favored, factor(w_day), factor(mon), lag1, lag2 and price_lag_4.

## 
## Call:
## lm(formula = sold_count ~ price + visit_count + basket_count + 
##     category_favored + factor(w_day) + factor(mon) + lag1 + lag2 + 
##     price_lag_4, data = train8)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.7090 -0.2920 -0.0360  0.3066  6.5977 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       1.389e+00  3.511e-01   3.957 9.14e-05 ***
## price             1.416e-03  4.236e-04   3.343 0.000915 ***
## visit_count       1.204e-03  1.175e-03   1.024 0.306421    
## basket_count      1.880e-01  4.495e-03  41.812  < 2e-16 ***
## category_favored -2.514e-05  3.484e-06  -7.214 3.17e-12 ***
## factor(w_day)2    4.494e-01  2.255e-01   1.993 0.046974 *  
## factor(w_day)3    3.319e-01  2.267e-01   1.464 0.144090    
## factor(w_day)4    5.809e-01  2.267e-01   2.562 0.010809 *  
## factor(w_day)5    4.630e-01  2.285e-01   2.027 0.043435 *  
## factor(w_day)6    2.596e-01  2.287e-01   1.135 0.257099    
## factor(w_day)7    1.589e-01  2.264e-01   0.702 0.483234    
## factor(mon)2     -5.283e-02  3.156e-01  -0.167 0.867132    
## factor(mon)3     -4.204e-01  3.092e-01  -1.360 0.174786    
## factor(mon)4     -8.103e-01  3.275e-01  -2.474 0.013805 *  
## factor(mon)5     -1.083e+00  3.565e-01  -3.037 0.002559 ** 
## factor(mon)6     -1.580e+00  3.553e-01  -4.448 1.15e-05 ***
## factor(mon)7     -1.510e+00  3.717e-01  -4.062 5.96e-05 ***
## factor(mon)8     -1.396e+00  3.642e-01  -3.832 0.000149 ***
## factor(mon)9     -1.271e+00  3.529e-01  -3.600 0.000362 ***
## factor(mon)10     7.597e-01  4.539e-01   1.674 0.095031 .  
## factor(mon)11    -1.493e+00  3.547e-01  -4.211 3.21e-05 ***
## factor(mon)12    -8.584e-02  3.053e-01  -0.281 0.778747    
## lag1              2.452e-03  2.144e-02   0.114 0.909006    
## lag2             -7.755e-02  2.114e-02  -3.669 0.000280 ***
## price_lag_4      -2.380e-03  3.866e-04  -6.155 1.99e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.186 on 364 degrees of freedom
## Multiple R-squared:  0.8921, Adjusted R-squared:  0.885 
## F-statistic: 125.4 on 24 and 364 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 28
## 
## data:  Residuals
## LM test = 148.81, df = 28, p-value < 2.2e-16

Simple Linear Regression with sqrt() transformation

After many iterations, it is seen that price_lag_4 and lag2 are not significant for the sqrt-transformed model, while lag5 is significant.

## 
## Call:
## lm(formula = sqrt ~ price + visit_count + basket_count + category_favored + 
##     factor(w_day) + factor(mon) + lag1 + lag5, data = train8)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.12605 -0.07060  0.00275  0.05667  1.38137 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      -1.370e-01  8.516e-02  -1.609 0.108438    
## price             1.857e-03  1.034e-04  17.952  < 2e-16 ***
## visit_count       1.832e-03  2.862e-04   6.399 4.79e-10 ***
## basket_count      2.600e-02  1.102e-03  23.586  < 2e-16 ***
## category_favored -2.486e-06  8.787e-07  -2.829 0.004932 ** 
## factor(w_day)2    1.254e-01  5.586e-02   2.244 0.025416 *  
## factor(w_day)3    7.681e-02  5.590e-02   1.374 0.170300    
## factor(w_day)4    9.302e-02  5.588e-02   1.665 0.096815 .  
## factor(w_day)5    4.764e-02  5.629e-02   0.846 0.397907    
## factor(w_day)6    3.585e-02  5.621e-02   0.638 0.524020    
## factor(w_day)7    1.142e-02  5.586e-02   0.205 0.838048    
## factor(mon)2     -1.245e-01  7.782e-02  -1.599 0.110577    
## factor(mon)3     -6.330e-02  7.628e-02  -0.830 0.407146    
## factor(mon)4     -9.575e-02  8.092e-02  -1.183 0.237477    
## factor(mon)5      5.763e-02  8.818e-02   0.654 0.513796    
## factor(mon)6     -1.786e-01  8.796e-02  -2.031 0.042991 *  
## factor(mon)7     -1.623e-01  9.222e-02  -1.759 0.079338 .  
## factor(mon)8     -1.519e-01  9.033e-02  -1.682 0.093455 .  
## factor(mon)9     -1.531e-01  8.745e-02  -1.750 0.080882 .  
## factor(mon)10     1.325e-02  1.041e-01   0.127 0.898775    
## factor(mon)11    -2.991e-01  8.470e-02  -3.531 0.000466 ***
## factor(mon)12    -7.630e-02  7.495e-02  -1.018 0.309349    
## lag1              2.297e-02  5.201e-03   4.417 1.32e-05 ***
## lag5             -8.736e-03  4.947e-03  -1.766 0.078278 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2925 on 365 degrees of freedom
## Multiple R-squared:  0.8944, Adjusted R-squared:  0.8878 
## F-statistic: 134.5 on 23 and 365 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 27
## 
## data:  Residuals
## LM test = 137.99, df = 27, p-value < 2.2e-16

The residual analyses show no significant difference, and the adjusted R-squared value of the sqrt-transformed model is higher.

Simple Linear Regression Model with BoxCox Transformation

After many iterations, price, visit_count, basket_count, category_favored, factor(w_day), factor(mon) and lag1 are the most significant variables for the Box-Cox model.

## 
## Call:
## lm(formula = BoxCox ~ price + visit_count + basket_count + category_favored + 
##     factor(w_day) + factor(mon) + lag1, data = train8)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3536 -0.1568 -0.0167  0.1038  3.2974 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      -6.179e+00  2.210e-01 -27.956  < 2e-16 ***
## price             8.694e-03  2.683e-04  32.408  < 2e-16 ***
## visit_count       7.182e-03  7.454e-04   9.636  < 2e-16 ***
## basket_count      3.019e-02  2.854e-03  10.580  < 2e-16 ***
## category_favored -1.417e-06  2.236e-06  -0.634   0.5267    
## factor(w_day)2    2.386e-01  1.452e-01   1.643   0.1013    
## factor(w_day)3    1.673e-01  1.459e-01   1.146   0.2524    
## factor(w_day)4    1.286e-01  1.460e-01   0.881   0.3791    
## factor(w_day)5    1.257e-02  1.471e-01   0.085   0.9319    
## factor(w_day)6    9.155e-02  1.468e-01   0.624   0.5333    
## factor(w_day)7   -1.666e-02  1.459e-01  -0.114   0.9092    
## factor(mon)2     -5.102e-01  2.033e-01  -2.510   0.0125 *  
## factor(mon)3     -1.734e-01  1.991e-01  -0.871   0.3844    
## factor(mon)4     -1.763e-01  2.108e-01  -0.836   0.4035    
## factor(mon)5      9.359e-01  2.291e-01   4.086 5.40e-05 ***
## factor(mon)6     -1.883e-01  2.283e-01  -0.825   0.4099    
## factor(mon)7     -2.038e-01  2.390e-01  -0.853   0.3943    
## factor(mon)8     -1.983e-01  2.342e-01  -0.847   0.3977    
## factor(mon)9     -2.341e-01  2.270e-01  -1.031   0.3032    
## factor(mon)10    -2.322e-01  2.712e-01  -0.856   0.3925    
## factor(mon)11    -5.278e-01  2.150e-01  -2.455   0.0146 *  
## factor(mon)12    -1.966e-01  1.958e-01  -1.004   0.3159    
## lag1              5.606e-02  1.350e-02   4.153 4.09e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7642 on 366 degrees of freedom
## Multiple R-squared:  0.919,  Adjusted R-squared:  0.9142 
## F-statistic: 188.9 on 22 and 366 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 26
## 
## data:  Residuals
## LM test = 119.14, df = 26, p-value = 6.999e-14

In the residual analysis and the adjusted R-squared comparison, the Box-Cox model is better than the others; however, it is very sensitive to back-transformation, so its predictions may be poor.
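The back-transformation sensitivity can be illustrated with `forecast::BoxCox`/`InvBoxCox` (the lambda value is chosen for illustration only):

```r
library(forecast)

lambda <- 0.1
y_t <- BoxCox(100, lambda)     # value on the transformed scale

InvBoxCox(y_t, lambda)         # exactly 100 back
InvBoxCox(y_t + 0.5, lambda)   # a small error on the transformed scale
                               # inflates to a much larger error after inversion
```

For small lambda the inverse transform is close to exponential, so modest residuals on the transformed scale can translate into large prediction errors on the sales scale.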

Arima Models

When constructing the ARIMA models, the auto.arima function is used and re-run every day. Seasonality is set to TRUE, and the frequency is determined as seven by observing the ACF and PACF graphs.

Additive decomposition, multiplicative decomposition, and a linear regression model are used to obtain stationary data.

## [1] "The Additive Model"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0092
## [1] "The Multiplicative Model"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0894
## [1] "Linear Regression"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0127

The multiplicative model’s remainder is also stationary at the α = 0.10 level; still, the additive decomposition is used as the main choice for the ARIMA and ARIMA-with-regressors models.

The residuals of the linear regression model are stationary, so an ARIMA model is fitted to these residuals and the two models are combined at the end.

The regressors mentioned above are used for the ARIMA model with regressors.

Arima

## Series: decomposed$random 
## ARIMA(5,0,0) with zero mean 
## 
## Coefficients:
##           ar1      ar2      ar3      ar4      ar5
##       -0.1946  -0.4366  -0.4028  -0.3270  -0.1579
## s.e.   0.0505   0.0486   0.0493   0.0484   0.0503
## 
## sigma^2 estimated as 5.204:  log likelihood=-857.33
## AIC=1726.67   AICc=1726.89   BIC=1750.36

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(5,0,0) with zero mean
## Q* = 53.984, df = 9, p-value = 1.901e-08
## 
## Model df: 5.   Total lags used: 14

Arima with Regressor

## Series: decomposed$random 
## Regression with ARIMA(0,0,0)(0,0,2)[7] errors 
## 
## Coefficients:
##         sma1     sma2  intercept    xreg
##       0.1987  -0.1041    -0.3066  0.0013
## s.e.  0.0506   0.0502     0.2076  0.0006
## 
## sigma^2 estimated as 6.891:  log likelihood=-911.34
## AIC=1832.67   AICc=1832.83   BIC=1852.41

## 
##  Ljung-Box test
## 
## data:  Residuals from Regression with ARIMA(0,0,0)(0,0,2)[7] errors
## Q* = 99.899, df = 10, p-value < 2.2e-16
## 
## Model df: 4.   Total lags used: 14

Arima combined with linear Regression

## Series: residuals 
## ARIMA(5,0,1) with zero mean 
## 
## Coefficients:
##          ar1     ar2      ar3      ar4     ar5     ma1
##       0.9452  0.1830  -0.1834  -0.3084  0.1771  -0.944
## s.e.  0.0664  0.0679   0.0695   0.0682  0.0594   0.040
## 
## sigma^2 estimated as 1.097:  log likelihood=-567.5
## AIC=1148.99   AICc=1149.29   BIC=1176.74

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(5,0,1) with zero mean
## Q* = 2.9941, df = 4, p-value = 0.5588
## 
## Model df: 6.   Total lags used: 10

Predictions

All models are used for prediction, including mul_arima and reg_mul_arima, since the multiplicative decomposition is also stationary at α = 0.10.

##     event_date actual sqrt_forecasted_sold BoxCox_forecasted_sold
##  1: 2021-06-18      3                    1                      0
##  2: 2021-06-19      0                    0                      0
##  3: 2021-06-20      1                    2                      3
##  4: 2021-06-21      2                    2                      2
##  5: 2021-06-22      2                    1                      0
##  6: 2021-06-23      2                    1                      0
##  7: 2021-06-24      2                    1                      0
##  8: 2021-06-25      2                    1                      0
##  9: 2021-06-26      1                    0                      0
## 10: 2021-06-27      0                    0                      0
## 11: 2021-06-28      4                    1                      0
## 12: 2021-06-29      1                    3                      4
## 13: 2021-06-30      0                    0                      0
## 14: 2021-07-01      1                    1                      1
##     lm_forecasted_sold forecasted_lm8_arima add_arima_forecasted
##  1:                 -1                    0                    2
##  2:                 -1                   -1                    2
##  3:                  1                    2                    3
##  4:                  1                    2                    3
##  5:                  1                    0                    2
##  6:                  1                    1                    2
##  7:                  0                    0                    2
##  8:                  0                    0                    1
##  9:                  0                    1                    1
## 10:                 -1                    0                    1
## 11:                  1                    1                    1
## 12:                  2                    2                    2
## 13:                 -1                   -1                    2
## 14:                  2                    2                    3
##     mul_arima_forecasted reg_add_arima_forecasted reg_mul_arima_forecasted
##  1:                    2                        2                        0
##  2:                    2                        2                        0
##  3:                    1                        3                        0
##  4:                    2                        3                        5
##  5:                    2                        2                        5
##  6:                    2                        2                        3
##  7:                    2                        2                       -1
##  8:                    1                        1                        0
##  9:                    1                        1                        1
## 10:                    1                        1                        1
## 11:                    1                        1                        1
## 12:                    2                        2                        2
## 13:                    2                        2                        6
## 14:                    2                        3                        0

Error Rates

##                       model  n mean       sd        CV      FBias MAPE     RMSE
## 1:     sqrt_forecasted_sold 14  1.5 1.160239 0.7734925  0.3333333  NaN 1.281740
## 2:   BoxCox_forecasted_sold 14  1.5 1.160239 0.7734925  0.5238095  NaN 1.982062
## 3:       lm_forecasted_sold 14  1.5 1.160239 0.7734925  0.7619048  Inf 1.732051
## 4:     forecasted_lm8_arima 14  1.5 1.160239 0.7734925  0.5714286  NaN 1.603567
## 5:     add_arima_forecasted 14  1.5 1.160239 0.7734925 -0.2857143  Inf 1.463850
## 6:     mul_arima_forecasted 14  1.5 1.160239 0.7734925 -0.0952381  Inf 1.253566
## 7: reg_add_arima_forecasted 14  1.5 1.160239 0.7734925 -0.2857143  Inf 1.463850
## 8: reg_mul_arima_forecasted 14  1.5 1.160239 0.7734925 -0.0952381  NaN 2.535463
##          MAD      MADP     WMAPE
## 1: 0.9285714 0.6190476 0.6190476
## 2: 1.5000000 1.0000000 1.0000000
## 3: 1.4285714 0.9523810 0.9523810
## 4: 1.2857143 0.8571429 0.8571429
## 5: 1.1428571 0.7619048 0.7619048
## 6: 0.8571429 0.5714286 0.5714286
## 7: 1.1428571 0.7619048 0.7619048
## 8: 2.0000000 1.3333333 1.3333333

The error rates are very high; however, the range of the response variable is very narrow, so this is expected. For example, if the actual sales are 1 and the prediction is 2, the error rate is 100%.
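This zero-inflated, narrow-range behavior also explains the Inf and NaN values in the MAPE column above. A minimal sketch with made-up vectors (not the report's data) illustrates why WMAPE is the metric used for model selection:

```r
# Illustrative actuals and predictions for a low-volume product
actual <- c(3, 0, 1, 2)
pred   <- c(1, 0, 2, 2)

# MAPE divides each error by the actual value:
# 0/0 gives NaN and x/0 gives Inf, so MAPE breaks down on zero-sale days
mape <- mean(abs(actual - pred) / actual)   # NaN here

# WMAPE divides the total error by the total actuals,
# so zero-sale days cause no division problems
wmape <- sum(abs(actual - pred)) / sum(actual)   # 3/6 = 0.5
```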

The mul_arima_forecasted has the lowest error rate.

Next Day Prediction

Every day, the error rates of the model predictions are calculated over the last 14 days, and the model whose predictions have the lowest WMAPE value is selected.

##           add_arima           mul_arima      xreg_mul_arima      xreg_add_arima 
##           1.1646832           1.0368527           0.7060850           1.2623962 
##         forecast_lm forecast_lm_arima.1           BoxCox_lm             Sqrt_lm 
##           0.6887907           0.6155989           1.4901452           1.0000000
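The daily selection rule described above can be sketched as follows. This is a simplified outline, assuming a `data.table` named `results` with an `actual` column and one forecast column per model; the function name `select_model` is illustrative, not from the report's code.

```r
library(data.table)

# Pick the model with the lowest WMAPE over the supplied window
# (the report uses the last 14 days of predictions)
select_model <- function(results, actual_col = "actual") {
  model_cols <- setdiff(names(results), c("event_date", actual_col))
  wmape <- sapply(model_cols, function(m) {
    sum(abs(results[[actual_col]] - results[[m]])) / sum(results[[actual_col]])
  })
  names(which.min(wmape))  # this model supplies the next-day prediction
}
```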

PRODUCT 9 - TrendyolMilla Bikini Top

By observing the graph below, the month effect is clearly visible. This is expected, since bikinis are worn during the hot seasons in Turkey. Moreover, by examining the ACF and PACF graphs, it can be said that there is a trend in the data and correlation at lag1 and lag7.

The ‘price’, ‘category_sold’, ‘basket_count’, and ‘category_favored’ attributes are more reliable and significantly correlated with the data. Although visit_count and favored_count are highly correlated with the data, they are also correlated with basket_count; therefore, they are not used as regressors.
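The screening step behind this choice can be sketched with a correlation matrix. This is a minimal outline, assuming the Product 9 data is held in a `data.table` named `data9` with the numeric columns summarized below:

```r
library(data.table)

num_cols <- c("sold_count", "price", "visit_count", "favored_count",
              "basket_count", "category_sold", "category_favored")

# Pairwise correlations, tolerating the NA prices in the data
cors <- cor(data9[, ..num_cols], use = "pairwise.complete.obs")
round(cors["sold_count", ], 2)
# Columns that correlate with sold_count mainly through basket_count
# (visit_count, favored_count) are dropped to avoid multicollinearity.
```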

##      price         event_date         product_content_id   sold_count    
##  Min.   :59.99   Min.   :2020-05-25   Length:404         Min.   :  0.00  
##  1st Qu.:59.99   1st Qu.:2020-09-02   Class :character   1st Qu.:  0.00  
##  Median :59.99   Median :2020-12-12   Mode  :character   Median :  0.00  
##  Mean   :60.11   Mean   :2020-12-12                      Mean   : 18.39  
##  3rd Qu.:59.99   3rd Qu.:2021-03-23                      3rd Qu.:  3.00  
##  Max.   :63.55   Max.   :2021-07-02                      Max.   :286.00  
##  NA's   :281                                                             
##   visit_count      favored_count     basket_count     category_sold   
##  Min.   :    0.0   Min.   :   0.0   Min.   :   0.00   Min.   :  20.0  
##  1st Qu.:    0.0   1st Qu.:   0.0   1st Qu.:   0.00   1st Qu.: 131.8  
##  Median :    0.0   Median :   0.0   Median :   0.00   Median : 562.5  
##  Mean   : 2460.6   Mean   : 241.1   Mean   :  88.91   Mean   :1290.3  
##  3rd Qu.:  578.5   3rd Qu.: 110.5   3rd Qu.:  19.00   3rd Qu.:1664.8  
##  Max.   :45833.0   Max.   :5011.0   Max.   :1735.00   Max.   :8099.0  
##                                                                       
##  category_brand_sold category_visits       ty_visits         category_basket  
##  Min.   :     0      Min.   :    107.0   Min.   :        1   Min.   :      0  
##  1st Qu.:     0      1st Qu.:    395.5   1st Qu.:        1   1st Qu.:      0  
##  Median :  2958      Median :   1360.5   Median :        1   Median :      0  
##  Mean   : 14053      Mean   :  80947.3   Mean   : 44617481   Mean   : 118640  
##  3rd Qu.: 15158      3rd Qu.:   2869.5   3rd Qu.:102350467   3rd Qu.: 102690  
##  Max.   :152168      Max.   :1335060.0   Max.   :178545693   Max.   :1230833  
##                                                                               
##  category_favored     w_day        mon          is_campaign     
##  Min.   :   628   Min.   :1   Min.   : 1.000   Min.   :0.00000  
##  1st Qu.:  2581   1st Qu.:2   1st Qu.: 4.000   1st Qu.:0.00000  
##  Median :  7788   Median :4   Median : 6.000   Median :0.00000  
##  Mean   : 15181   Mean   :4   Mean   : 6.463   Mean   :0.08663  
##  3rd Qu.: 16146   3rd Qu.:6   3rd Qu.: 9.000   3rd Qu.:0.00000  
##  Max.   :135551   Max.   :7   Max.   :12.000   Max.   :1.00000  
## 

The trend, lag1, lag2, lag3, and lag7 variables are added to the data.

Model Construction

The data does not have constant variance; therefore, besides the simple linear model, square root and Box-Cox transformations of the response are used in the regression models.
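The two transformed responses can be constructed as in the sketch below, assuming `train9` is the Product 9 training set; the `+ 1` shift is an assumption added here so the Box-Cox transform can handle the zero-sale days, not necessarily the report's exact choice:

```r
library(forecast)

# Variance-stabilizing transforms of the response
train9$sqrt <- sqrt(train9$sold_count)

lambda        <- BoxCox.lambda(train9$sold_count + 1)  # shift avoids zeros
train9$BoxCox <- BoxCox(train9$sold_count + 1, lambda)

# The transformed columns are then modeled with lm(), e.g.
fit_sqrt <- lm(sqrt ~ basket_count + category_sold + factor(w_day) +
                 factor(mon) + lag1 + lag3, data = train9)
summary(fit_sqrt)$adj.r.squared
```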

For Product 9, the attributes are reliable; therefore, all attributes were tried in the model and the most significant ones were selected.

Simple Linear Regression with No Transformation

## 
## Call:
## lm(formula = sold_count ~ price + visit_count + basket_count + 
##     favored_count + category_sold + category_visits + category_basket + 
##     category_favored + category_brand_sold + factor(w_day) + 
##     factor(mon) + trend + lag1 + lag3, data = train9)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -25.328  -1.123  -0.017   1.429  31.654 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -2.531e+02  1.106e+02  -2.289 0.022663 *  
## price                4.264e+00  1.852e+00   2.303 0.021875 *  
## visit_count         -1.070e-03  6.204e-04  -1.725 0.085326 .  
## basket_count         2.042e-01  7.701e-03  26.517  < 2e-16 ***
## favored_count       -5.007e-03  4.404e-03  -1.137 0.256322    
## category_sold        5.077e-03  8.825e-04   5.753 1.88e-08 ***
## category_visits      3.727e-06  8.721e-06   0.427 0.669356    
## category_basket      3.370e-05  1.500e-05   2.246 0.025284 *  
## category_favored    -3.203e-04  8.061e-05  -3.973 8.58e-05 ***
## category_brand_sold -1.533e-04  1.245e-04  -1.232 0.218893    
## factor(w_day)2      -1.559e+00  1.129e+00  -1.381 0.168182    
## factor(w_day)3       7.781e-01  1.150e+00   0.677 0.499153    
## factor(w_day)4       1.227e-01  1.152e+00   0.107 0.915202    
## factor(w_day)5       5.869e-02  1.154e+00   0.051 0.959465    
## factor(w_day)6      -2.655e-01  1.141e+00  -0.233 0.816057    
## factor(w_day)7       5.084e-01  1.138e+00   0.447 0.655289    
## factor(mon)2        -6.969e+00  1.826e+00  -3.816 0.000159 ***
## factor(mon)3        -7.138e+00  1.762e+00  -4.050 6.28e-05 ***
## factor(mon)4        -6.955e+00  1.990e+00  -3.496 0.000532 ***
## factor(mon)5        -1.004e+01  3.877e+00  -2.590 0.009980 ** 
## factor(mon)6        -6.875e+00  3.570e+00  -1.926 0.054932 .  
## factor(mon)7        -3.589e+00  3.185e+00  -1.127 0.260672    
## factor(mon)8        -7.678e-01  2.790e+00  -0.275 0.783305    
## factor(mon)9        -1.371e+00  2.540e+00  -0.540 0.589588    
## factor(mon)10       -2.238e+00  2.286e+00  -0.979 0.328171    
## factor(mon)11       -1.977e+00  2.059e+00  -0.960 0.337572    
## factor(mon)12       -1.114e+00  1.674e+00  -0.665 0.506367    
## trend               -7.571e-03  1.437e-02  -0.527 0.598650    
## lag1                 8.940e-02  2.394e-02   3.734 0.000219 ***
## lag3                 8.024e-02  1.817e-02   4.417 1.33e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.856 on 359 degrees of freedom
## Multiple R-squared:  0.9861, Adjusted R-squared:  0.985 
## F-statistic: 877.8 on 29 and 359 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 33
## 
## data:  Residuals
## LM test = 174.56, df = 33, p-value < 2.2e-16

The adjusted R-squared value is very high and the residuals are scattered around zero, so the model can be a good fit; however, the Breusch-Godfrey test indicates that serial correlation remains in the residuals.

Simple Linear Regression Model with sqrt transformation

## 
## Call:
## lm(formula = sqrt ~ price + visit_count + basket_count + favored_count + 
##     category_sold + category_visits + category_basket + category_favored + 
##     category_brand_sold + ty_visits + factor(w_day) + factor(mon) + 
##     lag1 + lag3, data = train9)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.3550 -0.2360 -0.0627  0.1745  4.8302 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -1.441e+01  1.309e+01  -1.101  0.27153    
## price                2.393e-01  2.176e-01   1.100  0.27227    
## visit_count          4.678e-05  7.203e-05   0.649  0.51647    
## basket_count         1.017e-02  9.330e-04  10.904  < 2e-16 ***
## favored_count       -6.264e-04  4.933e-04  -1.270  0.20494    
## category_sold        5.239e-04  1.078e-04   4.860 1.76e-06 ***
## category_visits      2.675e-06  8.506e-07   3.145  0.00180 ** 
## category_basket      2.998e-06  1.912e-06   1.568  0.11774    
## category_favored    -4.896e-05  9.317e-06  -5.255 2.54e-07 ***
## category_brand_sold -4.739e-06  1.571e-05  -0.302  0.76309    
## ty_visits            1.469e-08  3.011e-09   4.878 1.61e-06 ***
## factor(w_day)2       1.480e-01  1.361e-01   1.088  0.27753    
## factor(w_day)3       2.979e-01  1.380e-01   2.158  0.03155 *  
## factor(w_day)4       3.222e-01  1.386e-01   2.325  0.02066 *  
## factor(w_day)5       3.394e-01  1.387e-01   2.448  0.01486 *  
## factor(w_day)6       3.425e-01  1.376e-01   2.489  0.01325 *  
## factor(w_day)7       2.680e-01  1.365e-01   1.964  0.05034 .  
## factor(mon)2        -9.714e-02  3.247e-01  -0.299  0.76499    
## factor(mon)3        -1.092e+00  2.771e-01  -3.941 9.76e-05 ***
## factor(mon)4        -2.470e+00  2.943e-01  -8.390 1.13e-15 ***
## factor(mon)5        -9.352e-01  3.219e-01  -2.906  0.00389 ** 
## factor(mon)6        -2.216e-01  2.781e-01  -0.797  0.42599    
## factor(mon)7        -2.571e-02  2.692e-01  -0.096  0.92397    
## factor(mon)8         1.470e-01  2.327e-01   0.632  0.52797    
## factor(mon)9        -4.711e-02  2.250e-01  -0.209  0.83426    
## factor(mon)10       -2.080e-01  2.245e-01  -0.927  0.35465    
## factor(mon)11       -2.063e-01  2.255e-01  -0.915  0.36095    
## factor(mon)12       -1.924e-01  1.960e-01  -0.981  0.32704    
## lag1                 7.630e-03  2.887e-03   2.642  0.00859 ** 
## lag3                 3.889e-03  2.143e-03   1.815  0.07036 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7077 on 359 degrees of freedom
## Multiple R-squared:  0.9687, Adjusted R-squared:  0.9662 
## F-statistic: 383.2 on 29 and 359 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 33
## 
## data:  Residuals
## LM test = 179.08, df = 33, p-value < 2.2e-16

The sqrt-transformed model is also a good fit according to its R-squared value and residual analysis; however, its R-squared value is lower than that of the untransformed model.

BoxCox Transformation

## 
## Call:
## lm(formula = BoxCox ~ price + visit_count + basket_count + favored_count + 
##     category_visits + category_basket + ty_visits + factor(w_day) + 
##     factor(mon) + lag1 + lag3, data = train9)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.8438 -0.3777 -0.0468  0.3162  7.4378 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     -1.566e+01  2.409e+01  -0.650  0.51613    
## price            2.039e-01  4.009e-01   0.509  0.61138    
## visit_count      2.201e-04  1.253e-04   1.757  0.07969 .  
## basket_count     7.421e-03  1.593e-03   4.658 4.48e-06 ***
## favored_count   -1.526e-03  7.124e-04  -2.142  0.03282 *  
## category_visits  1.860e-06  9.205e-07   2.021  0.04406 *  
## category_basket  3.343e-06  1.030e-06   3.248  0.00127 ** 
## ty_visits        3.306e-08  5.284e-09   6.257 1.11e-09 ***
## factor(w_day)2   3.704e-01  2.515e-01   1.473  0.14164    
## factor(w_day)3   5.500e-01  2.521e-01   2.181  0.02979 *  
## factor(w_day)4   7.521e-01  2.526e-01   2.977  0.00311 ** 
## factor(w_day)5   7.995e-01  2.518e-01   3.175  0.00162 ** 
## factor(w_day)6   7.057e-01  2.535e-01   2.784  0.00564 ** 
## factor(w_day)7   5.575e-01  2.523e-01   2.209  0.02777 *  
## factor(mon)2     6.805e-01  5.955e-01   1.143  0.25392    
## factor(mon)3    -1.121e+00  5.107e-01  -2.196  0.02872 *  
## factor(mon)4    -4.796e+00  5.381e-01  -8.912  < 2e-16 ***
## factor(mon)5    -1.438e+00  4.918e-01  -2.924  0.00368 ** 
## factor(mon)6     4.299e-02  3.332e-01   0.129  0.89739    
## factor(mon)7    -5.215e-01  3.350e-01  -1.557  0.12041    
## factor(mon)8    -4.757e-01  3.349e-01  -1.420  0.15637    
## factor(mon)9    -5.032e-01  3.378e-01  -1.490  0.13714    
## factor(mon)10   -5.092e-01  3.348e-01  -1.521  0.12920    
## factor(mon)11   -4.708e-01  3.376e-01  -1.395  0.16397    
## factor(mon)12   -5.100e-01  3.350e-01  -1.522  0.12878    
## lag1             9.649e-03  5.306e-03   1.818  0.06982 .  
## lag3             4.296e-03  3.952e-03   1.087  0.27766    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.31 on 362 degrees of freedom
## Multiple R-squared:  0.9339, Adjusted R-squared:  0.9291 
## F-statistic: 196.6 on 26 and 362 DF,  p-value: < 2.2e-16

## 
##  Breusch-Godfrey test for serial correlation of order up to 30
## 
## data:  Residuals
## LM test = 176.24, df = 30, p-value < 2.2e-16

The Box-Cox transformed model can also be a good fit, since its adjusted R-squared value is high.

In all linear models, the residuals are significantly autocorrelated at lag1, which is not desirable.

Arima Models

When the ARIMA models are constructed, the auto.arima function is used, and it is re-run every day. Seasonality is set to TRUE, and the frequency is determined as 7 by observing the ACF and PACF graphs.

Additive decomposition, multiplicative decomposition, and a linear regression model are used to obtain stationary data.
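The additive branch of this pipeline can be sketched as follows, assuming `train9$sold_count` is the daily series; frequency 7 matches the weekly pattern visible in the ACF/PACF:

```r
library(forecast)
library(urca)

y          <- ts(train9$sold_count, frequency = 7)
decomposed <- decompose(y, type = "additive")   # or type = "multiplicative"
random     <- na.omit(decomposed$random)        # decompose() leaves edge NAs

# KPSS test on the random component: a small statistic suggests stationarity
summary(ur.kpss(random))

# Fit a (seasonal) ARIMA to the stationary remainder, refitted daily
fit <- auto.arima(random, seasonal = TRUE)
forecast(fit, h = 1)   # one-day-ahead forecast of the random component
```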

## [1] "The Additive Model"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0074
## [1] "The Multiplicative Model"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0767
## [1] "Linear Regression"
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0266

The additive model is used in the examination; however, the multiplicative model is also used in the predictions and its error rate is calculated, since it is significant at level = 0.05.

Arima

## Series: decomposed$random 
## ARIMA(0,0,2)(0,0,2)[7] with zero mean 
## 
## Coefficients:
##          ma1      ma2    sma1    sma2
##       0.0175  -0.2200  0.1261  0.1426
## s.e.  0.0676   0.0786  0.0562  0.0597
## 
## sigma^2 estimated as 101.4:  log likelihood=-1426.16
## AIC=2862.31   AICc=2862.47   BIC=2882.05

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,2)(0,0,2)[7] with zero mean
## Q* = 63.195, df = 10, p-value = 8.962e-10
## 
## Model df: 4.   Total lags used: 14

Arima with Regressor

## Series: decomposed$random 
## Regression with ARIMA(0,0,2)(1,0,2)[7] errors 
## 
## Coefficients:
##           ma1      ma2     sar1    sma1    sma2  intercept     xreg
##       -0.0989  -0.4228  -0.8466  0.9099  0.2316   336.1997  -5.5940
## s.e.   0.0874   0.1040   0.0731  0.0875  0.0586   131.6755   2.1907
## 
## sigma^2 estimated as 98.22:  log likelihood=-1419.66
## AIC=2855.32   AICc=2855.71   BIC=2886.9

## 
##  Ljung-Box test
## 
## data:  Residuals from Regression with ARIMA(0,0,2)(1,0,2)[7] errors
## Q* = 75.554, df = 7, p-value = 1.107e-13
## 
## Model df: 7.   Total lags used: 14

Arima combined with Linear Regression

## Series: residuals 
## ARIMA(1,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1    mean
##       0.1642  0.0050
## s.e.  0.0500  0.3364
## 
## sigma^2 estimated as 30.95:  log likelihood=-1218.56
## AIC=2443.13   AICc=2443.19   BIC=2455.02

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,0) with non-zero mean
## Q* = 17.51, df = 8, p-value = 0.02522
## 
## Model df: 2.   Total lags used: 10
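The combined model above can be sketched as follows: the linear regression captures the regressors, and an ARIMA model fitted to its residuals picks up the leftover autocorrelation. This assumes `fit_lm` is the lm() fit shown earlier and `test9` is the held-out test set:

```r
library(forecast)

# ARIMA on the lm residuals (auto.arima chose ARIMA(1,0,0) above)
res_fit <- auto.arima(residuals(fit_lm))

# Final forecast = regression part + residual-ARIMA part
lm_part  <- predict(fit_lm, newdata = test9)
res_part <- forecast(res_fit, h = nrow(test9))$mean
combined <- lm_part + as.numeric(res_part)
```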

Predictions

##     event_date actual sqrt_forecasted_sold BoxCox_forecasted_sold
##  1: 2021-06-18     46             39.60757               24.49287
##  2: 2021-06-19     26             40.73874               27.92878
##  3: 2021-06-20     15             37.68012               32.82206
##  4: 2021-06-21     20             18.12076               15.49253
##  5: 2021-06-22     47             19.00498               15.18145
##  6: 2021-06-23     40             23.86390               19.88549
##  7: 2021-06-24     37             22.30142               18.94182
##  8: 2021-06-25     20             21.13676               15.11249
##  9: 2021-06-26     27             15.26509               10.20279
## 10: 2021-06-27     20             29.71328               23.90117
## 11: 2021-06-28     26             16.29134               15.25011
## 12: 2021-06-29     19             29.99692               31.42128
## 13: 2021-06-30     20             29.01757               30.40515
## 14: 2021-07-01     14             20.43026               18.36339
##     lm_forecasted_sold forecasted_lm9_arima add_arima_forecasted
##  1:           52.06382             48.22485             53.63868
##  2:           53.60874             51.82412             53.17379
##  3:           25.62132             21.49631             55.07792
##  4:           28.82384             31.34988             50.73318
##  5:           48.30872             42.48197             39.35532
##  6:           41.30296             42.75482             40.53021
##  7:           38.18560             35.97025             37.48216
##  8:           32.00821             34.95974             33.04259
##  9:           26.10528             23.68064             28.24629
## 10:           18.14426             20.09753             31.74235
## 11:           18.17181             16.38052             32.17666
## 12:           29.19631             30.60151             28.66079
## 13:           32.02116             29.43881             25.90133
## 14:           27.17501             26.75307             24.36589
##     mul_arima_forecasted reg_add_arima_forecasted reg_mul_arima_forecasted
##  1:             47.78079                 53.62648                 49.07409
##  2:             38.06669                 53.13962                 37.64727
##  3:             74.23554                 55.08468                 74.99427
##  4:             37.16847                 50.77435                 46.26403
##  5:             32.56984                 39.42061                 31.35943
##  6:             53.92517                 40.56712                 53.99654
##  7:             38.03248                 37.93881                 39.39668
##  8:             27.91288                 33.47772                 29.19408
##  9:             23.01908                 28.63979                 23.45646
## 10:             40.43471                 32.14585                 41.19011
## 11:             22.61740                 32.60472                 23.04340
## 12:             23.99802                 29.05841                 24.46400
## 13:             35.12226                 26.30029                 35.80710
## 14:             24.92684                 24.74350                 25.40014

ERROR RATES

##                       model  n     mean       sd       CV       FBias      MAPE
## 1:     sqrt_forecasted_sold 14 26.92857 11.11108 0.412613  0.03668779 0.4676869
## 2:   BoxCox_forecasted_sold 14 26.92857 11.11108 0.412613  0.20583193 0.4702745
## 3:       lm_forecasted_sold 14 26.92857 11.11108 0.412613 -0.24863937 0.3958312
## 4:     forecasted_lm9_arima 14 26.92857 11.11108 0.412613 -0.20958624 0.3910197
## 5:     add_arima_forecasted 14 26.92857 11.11108 0.412613 -0.41678291 0.6196845
## 6:     mul_arima_forecasted 14 26.92857 11.11108 0.412613 -0.37880681 0.6777079
## 7: reg_add_arima_forecasted 14 26.92857 11.11108 0.412613 -0.42578766 0.6306575
## 8: reg_mul_arima_forecasted 14 26.92857 11.11108 0.412613 -0.41986102 0.7308202
##        RMSE       MAD      MADP     WMAPE
## 1: 13.64698 11.661328 0.4330467 0.4330467
## 2: 15.27308 12.805877 0.4755498 0.4755498
## 3: 10.77788  8.206740 0.3047596 0.3047596
## 4: 10.64224  8.284804 0.3076585 0.3076585
## 5: 16.88150 12.315465 0.4573382 0.4573382
## 6: 19.33931 13.314110 0.4944232 0.4944232
## 7: 16.98525 12.548624 0.4659967 0.4659967
## 8: 20.43287 14.469216 0.5373184 0.5373184

Next Day Prediction

Every day, the error rates of the model predictions are calculated over the last 14 days, and the model whose predictions have the lowest WMAPE value is selected.

##           add_arima           mul_arima      xreg_mul_arima      xreg_add_arima 
##            18.76924            15.58620            15.89121            19.15304 
##         forecast_lm forecast_lm_arima.1           BoxCox_lm             Sqrt_lm 
##            22.41879            15.34642            17.04839            18.15753

CONCLUSION

In order to predict one-day-ahead sales of the different products, different ARIMA and linear regression models have been tried, and according to their performance on the test set, which covers 29 May 2021 to 11 June 2021, a different model has been selected for each product. As external data, Trendyol's campaign dates are included; however, since not every Trendyol campaign is listed on the website, some outliers may not be fully explained by the models, and further investigation could improve them. Also, sales are affected by the overall state of the economy, so more external data, such as the dollar exchange rate, could be included for improved accuracy.

Treating each product separately is one of the strengths of this approach, even though it is time consuming. Trying various models and measuring their performance on the test data is another strength of the models proposed for each product.

Overall, it can be said that the models work reasonably well; the deviation from the real values is not too large.

REFERENCES

Lecture Notes